  1. [[breaking-changes-2.0]]
  2. == Breaking changes in 2.0
  3. This section discusses the changes that you need to be aware of when migrating
  4. your application to Elasticsearch 2.0.
  5. === Indices API
The <<alias-retrieving, get alias api>> will, by default, produce an error response
if a requested index does not exist. This change brings the defaults for this API in
line with the other Indices APIs. The <<multi-index>> options can be used on a request
to change this behavior.
  10. `GetIndexRequest.features()` now returns an array of Feature Enums instead of an array of String values.
  11. The following deprecated methods have been removed:
  12. * `GetIndexRequest.addFeatures(String[])` - Please use `GetIndexRequest.addFeatures(Feature[])` instead
  13. * `GetIndexRequest.features(String[])` - Please use `GetIndexRequest.features(Feature[])` instead
  14. * `GetIndexRequestBuilder.addFeatures(String[])` - Please use `GetIndexRequestBuilder.addFeatures(Feature[])` instead
  15. * `GetIndexRequestBuilder.setFeatures(String[])` - Please use `GetIndexRequestBuilder.setFeatures(Feature[])` instead
  16. === Partial fields
Partial fields, deprecated since 1.0.0beta1 in favor of <<search-request-source-filtering,source filtering>>, have been removed.
  18. === More Like This
  19. The More Like This API and the More Like This Field query have been removed in
  20. favor of the <<query-dsl-mlt-query, More Like This Query>>.
  21. The parameter `percent_terms_to_match` has been removed in favor of
  22. `minimum_should_match`.
  23. === Routing
The default hash function used for routing has been changed from djb2 to
murmur3, which helps balance document counts more evenly between shards.
This change should be transparent unless you relied on very specific
properties of djb2.
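
To illustrate how the choice of hash function affects shard selection, here is a minimal, self-contained sketch. It is not the Elasticsearch implementation: the class and method names are hypothetical, and hashing the routing value as UTF-16LE bytes with seed `0` is an assumption made for the example. It shows djb2 and the 32-bit x86 variant of MurmurHash3, each reduced to a shard number with a non-negative modulo.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class RoutingHashSketch {

    // djb2: start from 5381 and fold each char in as hash * 33 + c.
    static int djb2(String routing) {
        long hash = 5381;
        for (int i = 0; i < routing.length(); i++) {
            hash = ((hash << 5) + hash) + routing.charAt(i);
        }
        return (int) hash;
    }

    // MurmurHash3, x86 32-bit variant, over little-endian 4-byte blocks.
    static int murmur3(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;
        int blocks = data.length / 4;
        for (int i = 0; i < blocks; i++) {
            int k1 = (data[4 * i] & 0xff)
                    | ((data[4 * i + 1] & 0xff) << 8)
                    | ((data[4 * i + 2] & 0xff) << 16)
                    | ((data[4 * i + 3] & 0xff) << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }
        // Mix in the 1 to 3 trailing bytes (intentional fall-through).
        int k1 = 0;
        int tail = blocks * 4;
        switch (data.length & 3) {
            case 3: k1 ^= (data[tail + 2] & 0xff) << 16;
            case 2: k1 ^= (data[tail + 1] & 0xff) << 8;
            case 1: k1 ^= (data[tail] & 0xff);
                k1 *= c1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= c2;
                h1 ^= k1;
        }
        // Finalization: force all bits of the hash to avalanche.
        h1 ^= data.length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // Maps a (possibly negative) hash to a shard number in [0, numShards).
    static int shard(int hash, int numShards) {
        return Math.floorMod(hash, numShards);
    }

    public static void main(String[] args) {
        String routing = "user-42"; // made-up routing value
        byte[] bytes = routing.getBytes(StandardCharsets.UTF_16LE); // assumed encoding
        System.out.println("djb2 shard:    " + shard(djb2(routing), 5));
        System.out.println("murmur3 shard: " + shard(murmur3(bytes, 0), 5));
    }
}
```

The two functions generally place the same routing value on different shards, which is why the hash function is fixed per index rather than switched on existing data.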
  28. In addition, the following node settings related to routing have been deprecated:
  29. [horizontal]
`cluster.routing.operation.hash.type`::
This was an undocumented setting that selected the hash function
used for routing. `murmur3` is now enforced on new indices.
`cluster.routing.operation.use_type`::
This was an undocumented setting that made routing take the `_type` of the
document into account when computing its shard (default: `false`). `false` is
now enforced on new indices.
  37. === Async replication
  38. The `replication` parameter has been removed from all CRUD operations (index,
  39. update, delete, bulk, delete-by-query). These operations are now synchronous
  40. only, and a request will only return once the changes have been replicated to
  41. all active shards in the shard group.
  42. === Store
  43. The `memory` / `ram` store (`index.store.type`) option was removed in Elasticsearch 2.0.
  44. === Term Vectors API
Usage of `/_termvector` is deprecated in favor of `/_termvectors`.
  46. === Script fields
Script fields in 1.x were only returned as a single value. Even if a script
returned a list, the response contained an array holding a single value that
was itself a list, such as:
[source,json]
---------------
"fields": {
  "my_field": [
    [
      "v1",
      "v2"
    ]
  ]
}
---------------
In Elasticsearch 2.x, scripts that return a list of values are treated as
multivalued fields. The same example would return the following response,
with the values in a single array:
[source,json]
---------------
"fields": {
  "my_field": [
    "v1",
    "v2"
  ]
}
---------------
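
A client that has to read responses from both 1.x and 2.x clusters during a migration could normalize the two shapes itself. The following is a minimal sketch under that assumption; `ScriptFieldValues` and `normalize` are hypothetical names, and the input is assumed to be the already-parsed array for one script field. Note the heuristic cannot tell a 1.x wrapped list apart from a 2.x script whose single value is itself a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ScriptFieldValues {

    // Returns the script field values as a flat list, whichever shape was received.
    static List<Object> normalize(List<?> raw) {
        // 1.x shape: exactly one element, which is itself the list of values.
        if (raw.size() == 1 && raw.get(0) instanceof List) {
            return new ArrayList<Object>((List<?>) raw.get(0));
        }
        // 2.x shape: the values are already in a single array.
        return new ArrayList<Object>(raw);
    }

    public static void main(String[] args) {
        List<?> from1x = Arrays.asList(Arrays.asList("v1", "v2"));
        List<?> from2x = Arrays.asList("v1", "v2");
        System.out.println(normalize(from1x));
        System.out.println(normalize(from2x));
    }
}
```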
  73. === Main API
Previously, calling `GET /` returned the HTTP status code within the JSON response
in addition to the actual HTTP status code. The `status` field has been removed from the JSON response.
  76. === Java API
  77. `org.elasticsearch.index.queries.FilterBuilders` has been removed as part of the merge of
  78. queries and filters. These filters are now available in `QueryBuilders` with the same name.
  79. All methods that used to accept a `FilterBuilder` now accept a `QueryBuilder` instead.
In addition, some query builders have been removed or renamed:
* `commonTerms(...)` renamed to `commonTermsQuery(...)`
* `queryString(...)` renamed to `queryStringQuery(...)`
* `simpleQueryString(...)` renamed to `simpleQueryStringQuery(...)`
  84. * `textPhrase(...)` removed
  85. * `textPhrasePrefix(...)` removed
  86. * `textPhrasePrefixQuery(...)` removed
  87. * `filtered(...)` removed. Use `filteredQuery(...)` instead.
  88. * `inQuery(...)` removed.
  89. === Aggregations
The `date_histogram` aggregation now returns a `Histogram` object in the response, and the `DateHistogram` class has been removed. Similarly,
the `date_range`, `ipv4_range`, and `geo_distance` aggregations all return a `Range` object in the response, and the `IPV4Range`, `DateRange`,
and `GeoDistance` classes have been removed. The motivation for this is to have a single response API for the Range and Histogram aggregations
regardless of the type of data being queried. To support this, some changes were made in the `MultiBucketAggregation` interface, which applies
to all bucket aggregations:
  95. * The `getKey()` method now returns `Object` instead of `String`. The actual object type returned depends on the type of aggregation requested
  96. (e.g. the `date_histogram` will return a `DateTime` object for this method whereas a `histogram` will return a `Number`).
  97. * A `getKeyAsString()` method has been added to return the String representation of the key.
  98. * All other `getKeyAsX()` methods have been removed.
  99. * The `getBucketAsKey(String)` methods have been removed on all aggregations except the `filters` and `terms` aggregations.
  100. The `histogram` and the `date_histogram` aggregation now support a simplified `offset` option that replaces the previous `pre_offset` and
  101. `post_offset` rounding options. Instead of having to specify two separate offset shifts of the underlying buckets, the `offset` option
  102. moves the bucket boundaries in positive or negative direction depending on its argument.
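
The effect of a single `offset` on bucket boundaries can be sketched with a little arithmetic. This is an illustration rather than the aggregation's actual code: the class and method names are hypothetical, and it assumes each bucket covers the half-open range from its key up to the key plus the interval.

```java
// Hypothetical helper, for illustration only.
final class HistogramOffsetSketch {

    // Returns the key (lower boundary) of the bucket that contains the value.
    // Shifting by the offset before rounding down, then shifting back,
    // moves every boundary by the offset.
    static long bucketKey(long value, long interval, long offset) {
        return Math.floorDiv(value - offset, interval) * interval + offset;
    }

    public static void main(String[] args) {
        System.out.println(bucketKey(12, 10, 0));  // boundaries at ..., 0, 10, 20, ...
        System.out.println(bucketKey(12, 10, 5));  // boundaries at ..., 5, 15, 25, ...
        System.out.println(bucketKey(12, 10, -5)); // a negative offset shifts the other way
    }
}
```

With an interval of `10`, the value `12` falls in the bucket keyed `10` when the offset is `0`, and in the bucket keyed `5` when the offset is `5` or `-5`, since both offsets produce the same set of boundaries.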
The `date_histogram` options for `pre_zone` and `post_zone` are replaced by the `time_zone` option. The behavior of `time_zone` is
equivalent to the former `pre_zone` option. Setting `time_zone` to a value like "+01:00" now leads to the bucket calculations
being applied in the specified time zone. In addition, the `pre_zone_adjust_large_interval` option has been removed because
dates and bucket keys are now always returned in UTC.
Both the `histogram` and `date_histogram` aggregations now have a default `min_doc_count` of `0` instead of `1`.
  108. `include`/`exclude` filtering on the `terms` aggregation now uses the same syntax as regexp queries instead of the Java syntax. While simple
  109. regexps should still work, more complex ones might need some rewriting. Also, the `flags` parameter is not supported anymore.
  110. === Terms filter lookup caching
The terms filter lookup mechanism no longer supports the `cache` option
and relies on the filesystem cache instead. If the lookup index is not too
large, it is recommended to replicate it to all nodes by setting
`index.auto_expand_replicas: 0-all`, which also removes the network overhead.
  116. === Delete by query
The meaning of the `_shards` header in the delete by query response has changed. Before version 2.0, the `total`,
`successful` and `failed` fields in the header were based on the number of primary shards, and failures on replica
shards were not tracked. From version 2.0, the stats in the `_shards` header are based on all shards
of an index. The HTTP status code is unchanged and is based only on failures that occurred while executing on
primary shards.
  122. === Delete api with missing routing when required
The delete API now requires a routing value when deleting a document belonging to a type that has routing set to required in its
mapping. Previous Elasticsearch versions would trigger a broadcast delete on all shards belonging to the index.
If the routing value is missing, a `RoutingMissingException` is now thrown instead.
  126. === Mappings
  127. * The setting `index.mapping.allow_type_wrapper` has been removed. Documents should always be sent without the type as the root element.
  128. * The delete mappings API has been removed. Mapping types can no longer be deleted.
  129. ==== Removed type prefix on field names in queries
  130. Types can no longer be specified on fields within queries. Instead, specify type restrictions in the search request.
  131. The following is an example query in 1.x over types `t1` and `t2`:
[source,sh]
---------------
curl -XGET 'localhost:9200/index/_search' -d '
{
  "query": {
    "bool": {
      "should": [
        {"match": { "t1.field_only_in_t1": "foo" }},
        {"match": { "t2.field_only_in_t2": "bar" }}
      ]
    }
  }
}'
---------------
  146. In 2.0, the query should look like the following:
[source,sh]
---------------
curl -XGET 'localhost:9200/index/t1,t2/_search' -d '
{
  "query": {
    "bool": {
      "should": [
        {"match": { "field_only_in_t1": "foo" }},
        {"match": { "field_only_in_t2": "bar" }}
      ]
    }
  }
}'
---------------
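
When migrating many stored queries, the field-name part of this rewrite can be automated. The sketch below is one possible approach, not a tool shipped with Elasticsearch: `TypePrefixStripper` and `stripTypePrefix` are hypothetical names, and the caller is assumed to know the type names of the index. A field whose first path segment is an object field that happens to share a type's name would be rewritten incorrectly, so the output should be reviewed.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper, for illustration only.
final class TypePrefixStripper {

    // Removes a leading "<type>." from a field name if the prefix is a known type.
    static String stripTypePrefix(String field, Collection<String> knownTypes) {
        int dot = field.indexOf('.');
        if (dot > 0 && knownTypes.contains(field.substring(0, dot))) {
            return field.substring(dot + 1);
        }
        return field; // no type prefix: leave the name untouched
    }

    public static void main(String[] args) {
        Collection<String> types = Arrays.asList("t1", "t2");
        System.out.println(stripTypePrefix("t1.field_only_in_t1", types));
        System.out.println(stripTypePrefix("name.first", types));
    }
}
```

The types removed from the field names then go into the request path, as in `index/t1,t2/_search` above.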
  161. ==== Removed short name field access
  162. Field names in queries, aggregations, etc. must now use the complete name. Use of the short name
  163. caused ambiguities in field lookups when the same name existed within multiple object mappings.
  164. The following example illustrates the difference between 1.x and 2.0.
  165. Given these mappings:
[source,sh]
---------------
curl -XPUT 'localhost:9200/index' -d '
{
  "mappings": {
    "type": {
      "properties": {
        "name": {
          "type": "object",
          "properties": {
            "first": {"type": "string"},
            "last": {"type": "string"}
          }
        }
      }
    }
  }
}'
---------------
  185. The following query was possible in 1.x:
[source,sh]
---------------
curl -XGET 'localhost:9200/index/type/_search' -d '
{
  "query": {
    "match": { "first": "foo" }
  }
}'
---------------
  195. In 2.0, the same query should now be:
[source,sh]
---------------
curl -XGET 'localhost:9200/index/type/_search' -d '
{
  "query": {
    "match": { "name.first": "foo" }
  }
}'
---------------
  205. ==== Meta fields have limited configuration
  206. Meta fields (those beginning with underscore) are fields used by elasticsearch
  207. to provide special features. They now have limited configuration options.
  208. * `_id` configuration can no longer be changed. If you need to sort, use `_uid` instead.
  209. * `_type` configuration can no longer be changed.
  210. * `_index` configuration is limited to enabling the field.
  211. * `_routing` configuration is limited to requiring the field.
  212. * `_boost` has been removed.
  213. * `_field_names` configuration is limited to disabling the field.
  214. * `_size` configuration is limited to enabling the field.
  215. ==== Source field limitations
  216. The `_source` field could previously be disabled dynamically. Since this field
  217. is a critical piece of many features like the Update API, it is no longer
  218. possible to disable.
  219. The options for `compress` and `compress_threshold` have also been removed.
  220. The source field is already compressed. To minimize the storage cost,
  221. set `index.codec: best_compression` in index settings.
  222. ==== Boolean fields
  223. Boolean fields used to have a string fielddata with `F` meaning `false` and `T`
  224. meaning `true`. They have been refactored to use numeric fielddata, with `0`
  225. for `false` and `1` for `true`. As a consequence, the format of the responses of
  226. the following APIs changed when applied to boolean fields: `0`/`1` is returned
  227. instead of `F`/`T`:
  228. - <<search-request-fielddata-fields,fielddata fields>>
  229. - <<search-request-sort,sort values>>
  230. - <<search-aggregations-bucket-terms-aggregation,terms aggregations>>
  231. In addition, terms aggregations use a custom formatter for boolean (like for
  232. dates and ip addresses, which are also backed by numbers) in order to return
  233. the user-friendly representation of boolean fields: `false`/`true`:
[source,json]
---------------
"buckets": [
  {
    "key": 0,
    "key_as_string": "false",
    "doc_count": 42
  },
  {
    "key": 1,
    "key_as_string": "true",
    "doc_count": 12
  }
]
---------------
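
Client code that reads boolean values from fielddata fields, sort values, or terms aggregation keys may need to accept both representations while a cluster is being upgraded. A minimal sketch of such a reader follows; `BooleanFielddataValue` and `parse` are hypothetical names, and the value is assumed to have been parsed from JSON into a `String` or a `Number` already.

```java
// Hypothetical helper, for illustration only.
final class BooleanFielddataValue {

    // Accepts the 1.x string form ("T"/"F") and the 2.0 numeric form (1/0).
    static boolean parse(Object value) {
        if (value instanceof Number) {
            long n = ((Number) value).longValue();
            if (n == 0) {
                return false;
            }
            if (n == 1) {
                return true;
            }
        } else if ("T".equals(value)) {
            return true;
        } else if ("F".equals(value)) {
            return false;
        }
        throw new IllegalArgumentException("Unexpected boolean fielddata value: " + value);
    }

    public static void main(String[] args) {
        System.out.println(parse("T")); // 1.x representation
        System.out.println(parse(1));   // 2.0 representation
    }
}
```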
  249. ==== Murmur3 Fields
Fields of type `murmur3` can no longer change the `doc_values` or `index` settings.
They are always stored with doc values, and not indexed.
  252. ==== Source field configuration
  253. The `_source` field no longer supports `includes` and `excludes` parameters. When
  254. `_source` is enabled, the entire original source will be stored.
  255. ==== Config based mappings
  256. The ability to specify mappings in configuration files has been removed. To specify
  257. default mappings that apply to multiple indexes, use index templates.
  258. The following settings are no longer valid:
  259. * `index.mapper.default_mapping_location`
  260. * `index.mapper.default_percolator_mapping_location`
  261. === Codecs
  262. It is no longer possible to specify per-field postings and doc values formats
  263. in the mappings. This setting will be ignored on indices created before
  264. elasticsearch 2.0 and will cause mapping parsing to fail on indices created on
  265. or after 2.0. For old indices, this means that new segments will be written
  266. with the default postings and doc values formats of the current codec.
  267. It is still possible to change the whole codec by using the `index.codec`
  268. setting. Please however note that using a non-default codec is discouraged as
  269. it could prevent future versions of Elasticsearch from being able to read the
  270. index.
  271. === Scripting settings
Support for the `script.disable_dynamic` node setting has been removed. It is replaced by the
fine-grained script settings described in the <<enable-dynamic-scripting,scripting docs>>.
The following setting, previously used to enable dynamic scripts:
  275. [source,yaml]
  276. ---------------
  277. script.disable_dynamic: false
  278. ---------------
  279. can be replaced with the following two settings in `elasticsearch.yml` that
  280. achieve the same result:
  281. [source,yaml]
  282. ---------------
  283. script.inline: on
  284. script.indexed: on
  285. ---------------
  286. === Script parameters
  287. Deprecated script parameters `id`, `file`, and `scriptField` have been removed
  288. from all scriptable APIs. `script_id`, `script_file` and `script` should be used
  289. in their place.
  290. === Groovy scripts sandbox
The groovy sandbox and related settings have been removed. Groovy is now a
non-sandboxed scripting language, without any option to turn the sandbox on.
  293. === Plugins making use of scripts
  294. Plugins that make use of scripts must register their own script context through
  295. `ScriptModule`. Script contexts can be used as part of fine-grained settings to
  296. enable/disable scripts selectively.
  297. === Thrift and memcached transport
  298. The thrift and memcached transport plugins are no longer supported. Instead, use
  299. either the HTTP transport (enabled by default) or the node or transport Java client.
  300. === `search_type=count` deprecation
  301. The `count` search type has been deprecated. All benefits from this search type can
  302. now be achieved by using the `query_then_fetch` search type (which is the
  303. default) and setting `size` to `0`.
  304. === JSONP support
  305. JSONP callback support has now been removed. CORS should be used to access Elasticsearch
  306. over AJAX instead:
  307. [source,yaml]
  308. ---------------
  309. http.cors.enabled: true
  310. http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
  311. ---------------
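
Before deploying an `http.cors.allow-origin` pattern, it can be worth checking which origins it would accept. The sketch below does that outside of Elasticsearch. It rests on assumptions that should be verified against the HTTP module documentation: that a value wrapped in forward slashes is treated as a regular expression, that the expression must match the whole `Origin` header, and that `*` allows any origin. `CorsOriginCheck` and `isAllowed` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class CorsOriginCheck {

    // Returns true if the origin would be accepted under the assumptions above.
    static boolean isAllowed(String setting, String origin) {
        boolean isRegex = setting.length() > 1
                && setting.startsWith("/") && setting.endsWith("/");
        if (isRegex) {
            // Drop the surrounding slashes and require a full match.
            String regex = setting.substring(1, setting.length() - 1);
            return Pattern.matches(regex, origin);
        }
        return "*".equals(setting) || setting.equals(origin);
    }

    public static void main(String[] args) {
        String setting = "/https?:\\/\\/localhost(:[0-9]+)?/";
        System.out.println(isAllowed(setting, "http://localhost:8080"));
        System.out.println(isAllowed(setting, "http://example.com"));
    }
}
```

Requiring a full match matters here: with a partial match, an origin such as `http://localhost.example.com` would also be accepted by this pattern.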
  312. === Cluster state REST api
  313. The cluster state api doesn't return the `routing_nodes` section anymore when
  314. `routing_table` is requested. The newly introduced `routing_nodes` flag can
  315. be used separately to control whether `routing_nodes` should be returned.
  316. === Query DSL
Change to ranking behaviour: single-term queries on numeric fields now score in the same way as string fields (use of IDF, norms if enabled).
Previously, term queries on numeric fields were deliberately prevented from using the usual Lucene scoring logic; this behaviour was undocumented and, to some, unexpected.
If the introduction of scoring to numeric fields is undesirable for your query clauses, the fix is simple: wrap them in a `constant_score` or use a `filter` expression instead.
  320. The `fuzzy_like_this` and `fuzzy_like_this_field` queries have been removed.
  321. The `limit` filter is deprecated and becomes a no-op. You can achieve similar
  322. behaviour using the <<search-request-body,terminate_after>> parameter.
`or` and `and` on the one hand and `bool` on the other hand used to have
different performance characteristics depending on the wrapped filters. This is
now fixed. As a consequence, the `or` and `and` filters are deprecated in
favour of `bool`.
  327. The `execution` option of the `terms` filter is now deprecated and ignored if
  328. provided.
The `_cache` and `_cache_key` parameters of filters are deprecated in the REST
layer and removed in the Java API. If specified, they are ignored. Instead,
filters are always used as their own cache key, and Elasticsearch decides by
itself whether to cache a filter based on how often it is used.
  334. ==== Query/filter merge
Elasticsearch no longer distinguishes between queries and filters in the
DSL; it detects when scores are not needed and automatically optimizes the
query to not compute scores and optionally caches the result.
As a consequence, the `query` filter serves no purpose anymore and is deprecated.
  339. === Snapshot and Restore
  340. The obsolete parameters `expand_wildcards_open` and `expand_wildcards_close` are no longer
  341. supported by the snapshot and restore operations. These parameters have been replaced by
  342. a single `expand_wildcards` parameter. See <<multi-index,the multi-index docs>> for more.
  343. === `_shutdown` API
The `_shutdown` API has been removed without a replacement. Nodes should be managed via the operating
system and the provided start/stop scripts.
  346. === Analyze API
The Analyze API now returns 0 as the position of the first token instead of 1.
=== Multiple `path.data` striping
Previously, if the `path.data` setting listed multiple data paths, then a
  350. shard would be ``striped'' across all paths by writing a whole file to each
  351. path in turn (in accordance with the `index.store.distributor` setting). The
  352. result was that the files from a single segment in a shard could be spread
  353. across multiple disks, and the failure of any one disk could corrupt multiple
  354. shards.
  355. This striping is no longer supported. Instead, different shards may be
  356. allocated to different paths, but all of the files in a single shard will be
  357. written to the same path.
  358. If striping is detected while starting Elasticsearch 2.0.0 or later, all of
  359. the files belonging to the same shard will be migrated to the same path. If
  360. there is not enough disk space to complete this migration, the upgrade will be
  361. cancelled and can only be resumed once enough disk space is made available.
  362. The `index.store.distributor` setting has also been removed.
  363. === Hunspell dictionary configuration
  364. The parameter `indices.analysis.hunspell.dictionary.location` has been removed,
  365. and `<path.conf>/hunspell` is always used.
  366. === Java API Transport API construction
The `TransportClient` construction code has changed; it now uses the builder
pattern. Instead of using:
[source,java]
--------------------------------------------------
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
Client client = new TransportClient(settings);
--------------------------------------------------
  375. Use:
[source,java]
--------------------------------------------------
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
Client client = TransportClient.builder().settings(settings).build();
--------------------------------------------------