[[breaking-changes-2.0]]
== Breaking changes in 2.0

This section discusses the changes that you need to be aware of when migrating
your application to Elasticsearch 2.0.

=== Indices API

The <<alias-retrieving, get alias api>> will, by default, produce an error response
if a requested index does not exist. This change brings the defaults for this API in
line with the other Indices APIs. The <<multi-index>> options can be used on a request
to change this behavior.

`GetIndexRequest.features()` now returns an array of `Feature` enums instead of an array of String values.

The following deprecated methods have been removed:

* `GetIndexRequest.addFeatures(String[])` - Please use `GetIndexRequest.addFeatures(Feature[])` instead
* `GetIndexRequest.features(String[])` - Please use `GetIndexRequest.features(Feature[])` instead
* `GetIndexRequestBuilder.addFeatures(String[])` - Please use `GetIndexRequestBuilder.addFeatures(Feature[])` instead
* `GetIndexRequestBuilder.setFeatures(String[])` - Please use `GetIndexRequestBuilder.setFeatures(Feature[])` instead

=== Partial fields

Partial fields, deprecated since 1.0.0beta1, have been removed in favor of
<<search-request-source-filtering,source filtering>>.

=== More Like This Field

The More Like This Field query has been removed in favor of the <<query-dsl-mlt-query, More Like This Query>>
restricted to a specific `field`.

=== Routing

The default hash function that is used for routing has been changed from djb2 to
murmur3. This change should be transparent unless you relied on very specific
properties of djb2. It helps ensure a better balance of document counts
between shards.

In addition, the following node settings related to routing have been deprecated:

[horizontal]
`cluster.routing.operation.hash.type`::

  This was an undocumented setting that allowed you to configure which hash function
  to use for routing. `murmur3` is now enforced on new indices.

`cluster.routing.operation.use_type`::

  This was an undocumented setting that allowed you to take the `_type` of the
  document into account when computing its shard (default: `false`). `false` is
  now enforced on new indices.
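
To make the hash change above concrete, the following is a minimal sketch that compares shard assignment under the two functions. It is not the Elasticsearch implementation: the `shard_for` helper, the UTF-16 byte encoding of the routing value, and the "hash modulo number of primary shards" rule are assumptions made for illustration only.

```python
def _to_signed(h):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return h - (1 << 32) if h & 0x80000000 else h


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def djb2(routing):
    """djb2 string hash (h = h * 33 + c), wrapped to signed 32 bits.

    Assumes routing values in the Basic Multilingual Plane.
    """
    h = 5381
    for ch in routing:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return _to_signed(h)


def murmur3_32(data, seed=0):
    """MurmurHash3 (x86, 32-bit) over a byte string; returns an unsigned value."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n = len(data)
    rounded = n & ~3
    for i in range(0, rounded, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (_rotl((k * c1) & 0xFFFFFFFF, 15) * c2) & 0xFFFFFFFF
        h = (_rotl(h ^ k, 13) * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = data[rounded:]
    if tail:
        k = int.from_bytes(tail, "little")
        h ^= (_rotl((k * c1) & 0xFFFFFFFF, 15) * c2) & 0xFFFFFFFF
    # Finalization mix: force all bits of the hash to avalanche.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def murmur3_routing(routing):
    """Assumption: hash the UTF-16 little-endian bytes of the routing value."""
    return _to_signed(murmur3_32(routing.encode("utf-16-le")))


def shard_for(routing, num_shards, hash_fn):
    """Hypothetical helper: map a routing value to a shard number."""
    return hash_fn(routing) % num_shards


# Count how many of 1000 sequential ids land on each of 5 shards.
for name, fn in (("djb2", djb2), ("murmur3", murmur3_routing)):
    counts = [0] * 5
    for i in range(1000):
        counts[shard_for(str(i), 5, fn)] += 1
    print(name, counts)
```

Because the two functions generally place the same routing value on different shards, the setting is only enforced on new indices, as noted above.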

=== Async replication

The `replication` parameter has been removed from all CRUD operations (index,
update, delete, bulk, delete-by-query). These operations are now synchronous
only, and a request returns only once the changes have been replicated to
all active shards in the shard group.

=== Store

The `memory` / `ram` store (`index.store.type`) option was removed in Elasticsearch 2.0.

=== Term Vectors API

Usage of `/_termvector` is deprecated in favor of `/_termvectors`.

=== Script fields

In 1.x, a script field was always returned as a single value. Even if a script
returned a list, the response wrapped it in an array whose only element was
that list, such as:

[source,json]
---------------
"fields": {
  "my_field": [
    [
      "v1",
      "v2"
    ]
  ]
}
---------------

In Elasticsearch 2.x, scripts that return a list of values are treated as
multivalued fields. The same example returns the following response,
with the values in a single array:

[source,json]
---------------
"fields": {
  "my_field": [
    "v1",
    "v2"
  ]
}
---------------
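
Client code that has to read responses from both 1.x and 2.x clusters during a migration could normalize the two shapes shown above. The following is a minimal sketch; the helper names are hypothetical, and it assumes that a lone nested list is always the 1.x wrapping rather than a genuine single value that happens to be a list.

```python
def normalize_script_field(values):
    """Return script field values in the flat 2.x shape.

    The 1.x shape [["v1", "v2"]] is unwrapped to ["v1", "v2"];
    an already flat list is returned as a copy.
    """
    if len(values) == 1 and isinstance(values[0], list):
        return list(values[0])
    return list(values)


def normalize_fields(fields):
    """Apply normalize_script_field to every entry of a hit's "fields" object."""
    return {name: normalize_script_field(vals) for name, vals in fields.items()}


hit_1x = {"my_field": [["v1", "v2"]]}
hit_2x = {"my_field": ["v1", "v2"]}
print(normalize_fields(hit_1x) == normalize_fields(hit_2x))
```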

=== Main API

Previously, calling `GET /` returned the HTTP status code within the JSON response
in addition to the actual HTTP status code. The `status` field has been removed from the JSON response.

=== Java API

Some query builders have been removed or renamed:

* `commonTerms(...)` renamed to `commonTermsQuery(...)`
* `queryString(...)` renamed to `queryStringQuery(...)`
* `simpleQueryString(...)` renamed to `simpleQueryStringQuery(...)`
* `textPhrase(...)` removed
* `textPhrasePrefix(...)` removed
* `textPhrasePrefixQuery(...)` removed
* `filtered(...)` removed. Use `filteredQuery(...)` instead.
* `inQuery(...)` removed.
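
When many call sites need updating, the renames listed above can be applied mechanically. The following is a rough, text-level sketch and not a Java parser: the function names are hypothetical, and it assumes that any call spelled `filtered(` or `queryString(` in the scanned source really is the query builder method.

```python
import re

# Builders that have a direct replacement in 2.0, taken from the list above.
RENAMED_BUILDERS = {
    "commonTerms": "commonTermsQuery",
    "queryString": "queryStringQuery",
    "simpleQueryString": "simpleQueryStringQuery",
    "filtered": "filteredQuery",
}

# Builders removed without a replacement named in the list above.
REMOVED_BUILDERS = ["textPhrase", "textPhrasePrefix", "textPhrasePrefixQuery", "inQuery"]

_RENAMED_RE = re.compile(r"\b(" + "|".join(RENAMED_BUILDERS) + r")\(")
_REMOVED_RE = re.compile(r"\b(" + "|".join(REMOVED_BUILDERS) + r")\(")


def rename_builders(java_source):
    """Rewrite calls to renamed builders; other text is left untouched."""
    return _RENAMED_RE.sub(lambda m: RENAMED_BUILDERS[m.group(1)] + "(", java_source)


def find_removed_builders(java_source):
    """Return the sorted names of removed builders still called in the source."""
    return sorted({m.group(1) for m in _REMOVED_RE.finditer(java_source)})


source = 'QueryBuilders.queryString("foo"); QueryBuilders.inQuery("f", 1);'
print(rename_builders(source))
print(find_removed_builders(source))
```

The rewrite is idempotent because the new names are followed by `Query(` rather than `(`, so running it twice does not change the result.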

=== Aggregations

The `date_histogram` aggregation now returns a `Histogram` object in the response, and the `DateHistogram` class has been removed. Similarly,
the `date_range`, `ipv4_range`, and `geo_distance` aggregations all return a `Range` object in the response, and the `IPV4Range`, `DateRange`,
and `GeoDistance` classes have been removed. The motivation for this is to have a single response API for the Range and Histogram aggregations
regardless of the type of data being queried. To support this, some changes were made in the `MultiBucketAggregation` interface, which apply
to all bucket aggregations:

* The `getKey()` method now returns `Object` instead of `String`. The actual object type returned depends on the type of aggregation requested
(e.g. the `date_histogram` will return a `DateTime` object for this method whereas a `histogram` will return a `Number`).
* A `getKeyAsString()` method has been added to return the String representation of the key.
* All other `getKeyAsX()` methods have been removed.
* The `getBucketAsKey(String)` methods have been removed on all aggregations except the `filters` and `terms` aggregations.

The `histogram` and the `date_histogram` aggregations now support a simplified `offset` option that replaces the previous `pre_offset` and
`post_offset` rounding options. Instead of having to specify two separate offset shifts of the underlying buckets, the `offset` option
moves the bucket boundaries in the positive or negative direction depending on its argument.
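
One way to picture the effect of `offset` is the following minimal sketch. The rounding formula and the `bucket_key` helper are assumptions used for illustration, not code taken from Elasticsearch: values are rounded down to a multiple of the interval after the boundaries have been shifted by the offset.

```python
import math


def bucket_key(value, interval, offset=0):
    """Hypothetical helper: lower boundary of the bucket that contains value.

    With offset=0 the boundaries sit at multiples of interval; a positive or
    negative offset shifts every boundary by that amount.
    """
    return math.floor((value - offset) / interval) * interval + offset


# Interval 5: boundaries at ..., 0, 5, 10, ... without an offset,
# and at ..., 2, 7, 12, ... with offset=2.
for value in (6, 7):
    print(value, bucket_key(value, 5), bucket_key(value, 5, offset=2))
```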

The `date_histogram` options for `pre_zone` and `post_zone` are replaced by the `time_zone` option. The behavior of `time_zone` is
equivalent to the former `pre_zone` option. Setting `time_zone` to a value like "+01:00" now leads to the bucket calculations
being applied in the specified time zone. In addition, the `pre_zone_adjust_large_interval` option has been removed because
dates and bucket keys are now always returned in UTC.

=== Terms filter lookup caching

The terms filter lookup mechanism no longer supports the `cache` option
and relies on the filesystem cache instead. If the lookup index is not too
large, it is recommended to replicate it to all nodes by setting
`index.auto_expand_replicas: 0-all` in order to remove the network overhead as
well.
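
As a sketch of the recommendation above, the following builds the JSON body for an update index settings request (`PUT /<lookup index>/_settings`). The helper name is hypothetical and sending the request is left to whichever HTTP client you already use.

```python
import json


def auto_expand_settings(replicas="0-all"):
    """Settings body asking for the index to be replicated to all nodes."""
    return {"index": {"auto_expand_replicas": replicas}}


# Body to send with: PUT /<lookup index>/_settings
body = json.dumps(auto_expand_settings())
print(body)
```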

=== Parent parameter on update request

The `parent` parameter has been removed from the update request. Before 2.x it only set the routing parameter. The
`routing` setting should be used instead. The `parent` setting was confusing because it gave the impression that the parent
a child document points to can be changed, which is not the case.
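
Since `parent` only ever set the routing value on update requests, existing request parameters can be translated directly. The following is a minimal sketch with a hypothetical helper name; it assumes the parameters are held in a plain dictionary and that an explicit `routing` value should win if both are present.

```python
def migrate_update_params(params):
    """Return a copy of update request parameters with `parent` replaced by `routing`."""
    migrated = dict(params)
    parent = migrated.pop("parent", None)
    if parent is not None:
        # Keep an explicit routing value if the caller already supplied one.
        migrated.setdefault("routing", parent)
    return migrated


print(migrate_update_params({"parent": "p1", "retry_on_conflict": 3}))
```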

=== Delete by query

The meaning of the `_shards` headers in the delete by query response has changed. Before version 2.0, the `total`,
`successful` and `failed` fields in the header were based on the number of primary shards, and failures on replica
shards were not tracked. From version 2.0, the stats in the `_shards` header are based on all shards
of an index. The HTTP status code is unchanged and is based only on failures that occurred while executing on
primary shards.

=== Mappings

* The setting `index.mapping.allow_type_wrapper` has been removed. Documents should always be sent without the type as the root element.

==== Removed type prefix on field names in queries

Types can no longer be specified on fields within queries. Instead, specify type restrictions in the search request.

The following is an example query in 1.x over types `t1` and `t2`:

[source,sh]
---------------
curl -XGET 'localhost:9200/index/_search' -d '
{
  "query": {
    "bool": {
      "should": [
        {"match": { "t1.field_only_in_t1": "foo" }},
        {"match": { "t2.field_only_in_t2": "bar" }}
      ]
    }
  }
}'
---------------

In 2.0, the query should look like the following:

[source,sh]
---------------
curl -XGET 'localhost:9200/index/t1,t2/_search' -d '
{
  "query": {
    "bool": {
      "should": [
        {"match": { "field_only_in_t1": "foo" }},
        {"match": { "field_only_in_t2": "bar" }}
      ]
    }
  }
}'
---------------
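
The rewrite shown above can be sketched as a small helper that strips the type prefix from field names and collects the types for the request path. The function names are hypothetical, and the sketch assumes you pass in the list of type names yourself, since a prefix such as `t1.` could otherwise be mistaken for an object field.

```python
def split_type_prefix(field, known_types):
    """Split "t1.field" into ("t1", "field") when t1 is a known type."""
    prefix, sep, rest = field.partition(".")
    if sep and prefix in known_types:
        return prefix, rest
    return None, field


def migrate_clauses(clauses, known_types):
    """Strip type prefixes from clauses shaped like {"match": {field: value}}.

    Returns the types that were referenced and the rewritten clauses.
    """
    types, migrated = [], []
    for clause in clauses:
        new_clause = {}
        for query_type, body in clause.items():
            new_body = {}
            for field, value in body.items():
                doc_type, bare_field = split_type_prefix(field, known_types)
                if doc_type and doc_type not in types:
                    types.append(doc_type)
                new_body[bare_field] = value
            new_clause[query_type] = new_body
        migrated.append(new_clause)
    return types, migrated


def search_path(index, types):
    """Build the search path, restricting to types when any were found."""
    type_part = "/" + ",".join(types) if types else ""
    return "/" + index + type_part + "/_search"


old_should = [
    {"match": {"t1.field_only_in_t1": "foo"}},
    {"match": {"t2.field_only_in_t2": "bar"}},
]
types, new_should = migrate_clauses(old_should, ["t1", "t2"])
print(search_path("index", types))
print(new_should)
```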

==== Removed short name field access

Field names in queries, aggregations, etc. must now use the complete name. Use of the short name
caused ambiguities in field lookups when the same name existed within multiple object mappings.

The following example illustrates the difference between 1.x and 2.0.

Given these mappings:

[source,sh]
---------------
curl -XPUT 'localhost:9200/index' -d '
{
  "mappings": {
    "type": {
      "properties": {
        "name": {
          "type": "object",
          "properties": {
            "first": {"type": "string"},
            "last": {"type": "string"}
          }
        }
      }
    }
  }
}'
---------------

The following query was possible in 1.x:

[source,sh]
---------------
curl -XGET 'localhost:9200/index/type/_search' -d '
{
  "query": {
    "match": { "first": "foo" }
  }
}'
---------------

In 2.0, the same query should now be:

[source,sh]
---------------
curl -XGET 'localhost:9200/index/type/_search' -d '
{
  "query": {
    "match": { "name.first": "foo" }
  }
}'
---------------
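
To find the complete name for a short name used in existing queries, the mappings can be walked as in the following minimal sketch. The helper names are hypothetical; the sketch only follows nested `properties` and ignores other mapping features such as multi-fields.

```python
def full_paths(properties, prefix=""):
    """List the complete dotted names of all leaf fields under `properties`."""
    paths = []
    for name, mapping in properties.items():
        path = prefix + name
        children = mapping.get("properties")
        if children:
            paths.extend(full_paths(children, path + "."))
        else:
            paths.append(path)
    return paths


def resolve_short_name(properties, short_name):
    """Return every complete name whose last path element equals short_name.

    More than one result means the short name was ambiguous in 1.x.
    """
    return [p for p in full_paths(properties) if p.rsplit(".", 1)[-1] == short_name]


properties = {
    "name": {
        "type": "object",
        "properties": {
            "first": {"type": "string"},
            "last": {"type": "string"},
        },
    }
}
print(resolve_short_name(properties, "first"))
```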

==== Meta fields have limited configuration

Meta fields (those beginning with underscore) are fields used by Elasticsearch
to provide special features. They now have limited configuration options.

* `_id` configuration can no longer be changed. If you need to sort, use `_uid` instead.
* `_type` configuration can no longer be changed.
* `_index` configuration is limited to enabling the field.
* `_routing` configuration is limited to requiring the field.
* `_boost` has been removed.
* `_field_names` configuration is limited to disabling the field.
* `_size` configuration is limited to enabling the field.

=== Codecs

It is no longer possible to specify per-field postings and doc values formats
in the mappings. This setting will be ignored on indices created before
Elasticsearch 2.0 and will cause mapping parsing to fail on indices created on
or after 2.0. For old indices, this means that new segments will be written
with the default postings and doc values formats of the current codec.

It is still possible to change the whole codec by using the `index.codec`
setting. Note, however, that using a non-default codec is discouraged as
it could prevent future versions of Elasticsearch from being able to read the
index.

=== Scripts

Deprecated script parameters `id`, `file`, and `scriptField` have been removed
from all scriptable APIs. `script_id`, `script_file` and `script` should be used
in their place.
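
For request bodies that are built programmatically, the renames above can be applied with a small lookup table. The following is a minimal sketch with a hypothetical helper name; it assumes it is given only the object that carries the script parameters, because a key such as `id` may mean something else elsewhere in a request.

```python
# Old parameter name -> replacement, in the order given above.
RENAMED_SCRIPT_PARAMS = {
    "id": "script_id",
    "file": "script_file",
    "scriptField": "script",
}


def migrate_script_params(params):
    """Return a copy of a script-carrying object with the removed names replaced."""
    return {RENAMED_SCRIPT_PARAMS.get(key, key): value for key, value in params.items()}


print(migrate_script_params({"id": "my_script", "params": {"factor": 2}}))
```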

=== Thrift and memcached transport

The thrift and memcached transport plugins are no longer supported. Instead, use
either the HTTP transport (enabled by default) or the node or transport Java client.