
Reworked 5.0 breaking changes docs

Clinton Gormley 9 years ago
parent
commit
5c845f8bb5

+ 31 - 856
docs/reference/migration/migrate_5_0.asciidoc

@@ -4,877 +4,52 @@
 This section discusses the changes that you need to be aware of when migrating
 your application to Elasticsearch 5.0.
 
+[IMPORTANT]
+.Reindex indices from Elasticsearch 1.x or before
+=========================================
+
+Indices created in Elasticsearch 1.x or before will need to be reindexed with
+Elasticsearch 2.x in order to be readable by Elasticsearch 5.x. The easiest
+way to do this is to upgrade to Elasticsearch 2.3 or later and to use the
+`reindex` API.
+
+=========================================
+
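As a rough illustration of the note above, the request sent to the `reindex` API can be sketched in Python. The index names `old_index` and `new_index` and the helper `build_reindex_request` are hypothetical; the sketch only builds the request and does not contact a cluster.

```python
import json

def build_reindex_request(source_index, dest_index):
    """Build the method, path and JSON body for a reindex call.

    The index names are placeholders chosen by the caller.
    """
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return "POST", "/_reindex", json.dumps(body, sort_keys=True)

# Hypothetical index names, for illustration only.
method, path, payload = build_reindex_request("old_index", "new_index")
print(method, path)
print(payload)
```

The returned tuple can be handed to whichever HTTP client is already in use.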
+[float]
+=== Also see:
+
 * <<breaking_50_search_changes>>
+* <<breaking_50_mapping_changes>>
+* <<breaking_50_percolator>>
+* <<breaking_50_index_apis>>
+* <<breaking_50_settings_changes>>
+* <<breaking_50_allocation>>
 * <<breaking_50_rest_api_changes>>
 * <<breaking_50_cat_api>>
-* <<breaking_50_parent_child_changes>>
-* <<breaking_50_settings_changes>>
-* <<breaking_50_mapping_changes>>
-* <<breaking_50_plugins>>
 * <<breaking_50_java_api_changes>>
-* <<breaking_50_cache_concurrency>>
-* <<breaking_50_non_loopback>>
-* <<breaking_50_thread_pool>>
-* <<breaking_50_allocation>>
-* <<breaking_50_percolator>>
 * <<breaking_50_packaging>>
-* <<breaking_50_scripting>>
-* <<breaking_50_term_vectors>>
-* <<breaking_50_security>>
-* <<breaking_50_snapshot_restore>>
-
-[[breaking_50_search_changes]]
-=== Warmers
-
-Thanks to several changes like doc values by default or disk-based norms,
-warmers have become quite useless. As a consequence, warmers and the warmer
-API have been removed: it is not possible anymore to register queries that
-will run before a new IndexSearcher is published.
-
-Don't worry if you have warmers defined on your indices, they will simply be
-ignored when upgrading to 5.0.
-
-=== Search changes
-
-==== `search_type=count` removed
-
-The `count` search type has been deprecated since version 2.0.0 and is now removed.
-In order to get the same benefits, you just need to set the value of the `size`
-parameter to `0`.
-
-For instance, the following request:
-
-[source,sh]
----------------
-GET /my_index/_search?search_type=count
-{
-  "aggs": {
-    "my_terms": {
-       "terms": {
-         "field": "foo"
-       }
-     }
-  }
-}
----------------
-
-can be replaced with:
-
-[source,sh]
----------------
-GET /my_index/_search
-{
-  "size": 0,
-  "aggs": {
-    "my_terms": {
-       "terms": {
-         "field": "foo"
-       }
-     }
-  }
-}
----------------
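For client code that assembles search requests programmatically, the rewrite above could be automated along these lines. This is a minimal sketch: the function name `migrate_count_search` and the representation of a request as a `(params, body)` pair are assumptions made for illustration.

```python
from copy import deepcopy

def migrate_count_search(params, body):
    """Replace search_type=count with size=0, leaving other requests untouched."""
    new_params = dict(params)
    new_body = deepcopy(body)
    if new_params.get("search_type") == "count":
        del new_params["search_type"]
        new_body["size"] = 0
    return new_params, new_body

old_params = {"search_type": "count"}
old_body = {"aggs": {"my_terms": {"terms": {"field": "foo"}}}}
new_params, new_body = migrate_count_search(old_params, old_body)
print(new_params, new_body)
```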
-
-==== `search_type=scan` removed
-
-The `scan` search type has been deprecated since version 2.1.0 and is now removed.
-All benefits from this search type can now be achieved by doing a scroll
-request that sorts documents in `_doc` order, for instance:
-
-[source,sh]
----------------
-GET /my_index/_search?scroll=2m
-{
-  "sort": [
-    "_doc"
-  ]
-}
----------------
-
-Scroll requests sorted by `_doc` have been optimized to more efficiently resume
-from where the previous request stopped, so this will have the same performance
-characteristics as the former `scan` search type.
-
-==== Boost accuracy for queries on `_all`
-
-Per-field boosts on the `_all` field are now compressed into a single byte instead
-of the 4 bytes used previously. While this makes the index more space-efficient, it
-also means that the boosts are encoded less accurately.
-
-[[breaking_50_rest_api_changes]]
-=== REST API changes
-
-==== id values longer than 512 bytes are rejected
-
-When specifying an `_id` value longer than 512 bytes, the request will be
-rejected.
-
-==== search exists api removed
-
-The search exists api has been removed in favour of using the search api with
-`size` set to `0` and `terminate_after` set to `1`.
-
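A possible replacement for a search exists call is sketched below. The helper names are hypothetical, and the response shape (an integer under `hits.total`) is an assumption about the search response, not something stated in this document.

```python
def build_exists_request(query):
    """Search body that stops after the first matching document."""
    return {"size": 0, "terminate_after": 1, "query": query}

def has_match(response):
    """True if the search response reports at least one hit.

    Assumes `hits.total` is an integer count.
    """
    return response["hits"]["total"] > 0

body = build_exists_request({"term": {"user": "kimchy"}})
# Hypothetical responses, shown only to exercise has_match().
print(has_match({"hits": {"total": 1, "hits": []}}))
print(has_match({"hits": {"total": 0, "hits": []}}))
```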
-==== `/_optimize` endpoint removed
-
-The deprecated `/_optimize` endpoint has been removed. The `/_forcemerge`
-endpoint should be used in lieu of optimize.
-
-The `GET` HTTP verb for `/_forcemerge` is no longer supported, please use the
-`POST` HTTP verb.
-
-==== Deprecated queries removed
-
-The following deprecated queries have been removed:
-
-* `filtered`: use `bool` query instead, which supports `filter` clauses too
-* `and`: use `must` clauses in a `bool` query instead
-* `or`: use `should` clauses in a `bool` query instead
-* `limit`: use `terminate_after` parameter instead
-* `fquery`: obsolete after filters and queries have been merged
-* `query`: obsolete after filters and queries have been merged
-
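The first item in the list above, replacing `filtered` with `bool`, might be applied to stored query bodies roughly as follows. The function name `filtered_to_bool` is hypothetical, and the sketch ignores any other options the `filtered` query may carry. Scoring may also differ when the original had no `query` part, so results should be checked.

```python
def filtered_to_bool(query):
    """Rewrite a top-level `filtered` query as a `bool` query.

    The `query` part becomes a `must` clause and the `filter` part
    becomes a `filter` clause. Other queries are returned unchanged.
    """
    if "filtered" not in query:
        return query
    inner = query["filtered"]
    clauses = {}
    if "query" in inner:
        clauses["must"] = inner["query"]
    if "filter" in inner:
        clauses["filter"] = inner["filter"]
    return {"bool": clauses}

old_query = {
    "filtered": {
        "query": {"match": {"title": "search"}},
        "filter": {"term": {"status": "published"}},
    }
}
new_query = filtered_to_bool(old_query)
print(new_query)
```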
-==== Unified fuzziness parameter
-
-* Removed support for the deprecated `min_similarity` parameter in `fuzzy query`, in favour of `fuzziness`.
-* Removed support for the deprecated `fuzzy_min_sim` parameter in `query_string` query, in favour of `fuzziness`.
-* Removed support for the deprecated `edit_distance` parameter in completion suggester, in favour of `fuzziness`.
-
-==== indices query
-
-Removed support for the deprecated `filter` and `no_match_filter` fields in `indices` query,
-in favour of `query` and `no_match_query`.
-
-==== nested query
-
-Removed support for the deprecated `filter` fields in `nested` query, in favour of `query`.
-
-==== terms query
-
-Removed support for the deprecated `minimum_should_match` and `disable_coord` in `terms` query, use `bool` query instead.
-Also removed support for the deprecated `execution` parameter.
-
-==== function_score query
-
-Removed support for the top level `filter` element in `function_score` query, replaced by `query`.
-
-==== highlighters
-
-Removed support for multiple highlighter names, the only supported ones are: `plain`, `fvh` and `postings`.
-
-==== top level filter
-
-Removed support for the deprecated top level `filter` in the search api, replaced by `post_filter`.
-
-==== `query_binary` and `filter_binary` removed
-
-Removed support for the undocumented `query_binary` and `filter_binary` sections of a search request.
-
-==== `span_near`'s `collect_payloads` deprecated
-
-Payloads are now loaded when needed.
-
-[[breaking_50_cat_api]]
-=== CAT API changes
-
-==== Use Accept header for specifying response media type
-
-Previous versions of Elasticsearch accepted the Content-type header
-field for controlling the media type of the response in the cat API.
-This is in opposition to the HTTP spec which specifies the Accept
-header field for this purpose. Elasticsearch now uses the Accept header
-field and support for using the Content-Type header field for this
-purpose has been removed.
-
-==== Host field removed from the cat nodes API
-
-The `host` field has been removed from the cat nodes API as its value
-is always equal to the `ip` field. The `name` field is available in the
-cat nodes API and should be used instead of the `host` field.
-
-==== Changes to cat recovery API
-
-The fields `bytes_recovered` and `files_recovered` have been added to
-the cat recovery API. These fields, respectively, indicate the total
-number of bytes and files that have been recovered.
-
-The fields `total_files` and `total_bytes` have been renamed to
-`files_total` and `bytes_total`, respectively.
-
-Additionally, the field `translog` has been renamed to
-`translog_ops_recovered`, the field `translog_total` to
-`translog_ops` and the field `translog_percent` to
-`translog_ops_percent`. The short aliases for these fields are `tor`,
-`to`, and `top`, respectively.
-
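Scripts that request specific cat recovery columns could translate the old names with a small lookup table. This sketch covers only the renames listed above; the function name is hypothetical, and passing the result through the cat API's `h` parameter is assumed to be how the script selects columns.

```python
# Old cat recovery column name -> new name, as listed above.
CAT_RECOVERY_RENAMES = {
    "total_files": "files_total",
    "total_bytes": "bytes_total",
    "translog": "translog_ops_recovered",
    "translog_total": "translog_ops",
    "translog_percent": "translog_ops_percent",
}

def migrate_cat_recovery_columns(columns):
    """Map old column names to their new names, keeping unknown names as-is."""
    return [CAT_RECOVERY_RENAMES.get(name, name) for name in columns]

old_columns = ["index", "total_files", "translog", "translog_percent"]
new_columns = migrate_cat_recovery_columns(old_columns)
print("h=" + ",".join(new_columns))
```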
-[[breaking_50_parent_child_changes]]
-=== Parent/Child changes
-
-The `children` aggregation, parent child inner hits and `has_child` and `has_parent` queries will not work on indices
-with `_parent` field mapping created before version `2.0.0`. The data of these indices needs to be re-indexed into a new index.
-
-The format of the join between parent and child documents changed with the `2.0.0` release. The old
-format can't be read from version `5.0.0` onwards. The new format allows for a much more efficient and
-scalable join between parent and child documents, and the join data structures are now stored on disk
-rather than in the JVM heap space as before.
-
-==== `score_type` has been removed
-
-The `score_type` option has been removed from the `has_child` and `has_parent` queries in favour of the `score_mode` option
-which does the exact same thing.
-
-==== `sum` score mode removed
-
-The `sum` score mode has been removed in favour of the `total` mode which does the same and is already available in
-previous versions.
-
-==== `max_children` option
-
-When `max_children` was set to `0` on the `has_child` query, there was no upper limit on how many child documents
-were allowed to match. This has changed and `0` now really means that zero child documents are allowed. If no upper limit
-is needed then the `max_children` option shouldn't be defined at all on the `has_child` query.
-
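Queries written against the old meaning of `max_children: 0` could be adjusted with a helper like the one below. The name `migrate_has_child` is hypothetical, and the sketch handles only a top-level `has_child` query.

```python
def migrate_has_child(query):
    """Drop `max_children: 0`, which used to mean "no upper limit"."""
    if "has_child" not in query:
        return query
    has_child = dict(query["has_child"])
    if has_child.get("max_children") == 0:
        del has_child["max_children"]
    return {"has_child": has_child}

old_query = {
    "has_child": {
        "type": "comment",
        "max_children": 0,
        "query": {"match_all": {}},
    }
}
new_query = migrate_has_child(old_query)
print(new_query)
```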
-==== `_parent` field no longer indexed
-
-The join between parent and child documents no longer relies on indexed fields and therefore from `5.0.0` onwards
-the `_parent` field won't be indexed. In order to find documents that refer to a specific parent id
-the new `parent_id` query can be used. The get response and hits inside the search response still include
-the parent id under the `_parent` key.
-
-[[breaking_50_settings_changes]]
-=== Settings changes
-
-From Elasticsearch 5.0 on, all settings are validated before they are applied. Node level and default index
-level settings are validated on node startup; dynamic cluster and index settings are validated before they are updated/added
-to the cluster state. Every setting must be a _known_ setting, or in other words all settings must be registered with the
-node or transport client they are used with. This implies that plugins that define custom settings must register all of their
-settings during plugin loading using the `SettingsModule#registerSettings(Setting)` method.
-
-==== Node settings
-
-The `name` setting has been removed and is replaced by `node.name`. Usage of `-Dname=some_node_name` is not supported
-anymore.
-
-==== Transport Settings
-
-All settings with a `netty` infix have been replaced by their already existing `transport` synonyms. For instance `transport.netty.bind_host` is
-no longer supported and should be replaced by the superseding setting `transport.bind_host`.
-
-==== Analysis settings
-
-The `index.analysis.analyzer.default_index` analyzer is not supported anymore.
-If you wish to change the analyzer to use for indexing, change the
-`index.analysis.analyzer.default` analyzer instead.
-
-==== Ping timeout settings
-
-Previously, there were three settings for the ping timeout: `discovery.zen.initial_ping_timeout`,
-`discovery.zen.ping.timeout` and `discovery.zen.ping_timeout`. The former two have been removed and
-the only setting key for the ping timeout is now `discovery.zen.ping_timeout`. The default value for
-ping timeouts remains at three seconds.
-
-==== Recovery settings
-
-Recovery settings deprecated in 1.x have been removed:
-
- * `index.shard.recovery.translog_size` is superseded by `indices.recovery.translog_size`
- * `index.shard.recovery.translog_ops` is superseded by `indices.recovery.translog_ops`
- * `index.shard.recovery.file_chunk_size` is superseded by `indices.recovery.file_chunk_size`
- * `index.shard.recovery.concurrent_streams` is superseded by `indices.recovery.concurrent_streams`
- * `index.shard.recovery.concurrent_small_file_streams` is superseded by `indices.recovery.concurrent_small_file_streams`
- * `indices.recovery.max_size_per_sec` is superseded by `indices.recovery.max_bytes_per_sec`
-
-If you are using any of these settings please take the time to review their purpose. All of the settings above are considered
-_expert settings_ and should only be used if absolutely necessary. If you have set any of the above settings as persistent
-cluster settings please use the settings update API and set their superseding keys accordingly.
-
-The following settings have been removed without replacement:
-
- * `indices.recovery.concurrent_small_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries is throttled via allocation deciders
- * `indices.recovery.concurrent_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries is throttled via allocation deciders
-
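The two lists above can be combined into a small migration pass over a flat settings dictionary. This is only a sketch: the function name and the flat-dictionary input are assumptions, and a key whose superseding name also appears in the "removed without replacement" list is reported as dropped rather than renamed.

```python
# Deprecated recovery setting -> superseding key, as listed above.
RENAMED = {
    "index.shard.recovery.translog_size": "indices.recovery.translog_size",
    "index.shard.recovery.translog_ops": "indices.recovery.translog_ops",
    "index.shard.recovery.file_chunk_size": "indices.recovery.file_chunk_size",
    "index.shard.recovery.concurrent_streams": "indices.recovery.concurrent_streams",
    "index.shard.recovery.concurrent_small_file_streams":
        "indices.recovery.concurrent_small_file_streams",
    "indices.recovery.max_size_per_sec": "indices.recovery.max_bytes_per_sec",
}

# Settings listed above as removed without replacement.
REMOVED = {
    "indices.recovery.concurrent_small_file_streams",
    "indices.recovery.concurrent_file_streams",
}

def migrate_recovery_settings(settings):
    """Return (migrated settings, list of original keys that were dropped)."""
    migrated = {}
    dropped = []
    for key, value in settings.items():
        new_key = RENAMED.get(key, key)
        if new_key in REMOVED:
            dropped.append(key)
            continue
        migrated[new_key] = value
    return migrated, dropped

# Hypothetical settings, for illustration only.
old_settings = {
    "index.shard.recovery.translog_ops": 1000,
    "indices.recovery.max_size_per_sec": "40mb",
    "index.shard.recovery.concurrent_small_file_streams": 2,
    "index.number_of_shards": 5,
}
migrated, dropped = migrate_recovery_settings(old_settings)
print(migrated)
print(dropped)
```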
-==== Translog settings
-
-The `index.translog.flush_threshold_ops` setting is not supported anymore. In order to control flushes based on the transaction log
-growth use `index.translog.flush_threshold_size` instead. Changing the translog type with `index.translog.fs.type` is not supported
-anymore, the `buffered` implementation is now the only available option and uses a fixed `8kb` buffer.
-
-The translog is fsynced on a per-request basis by default, so the ability to fsync on every operation is not necessary anymore. In fact, it can
-be a performance bottleneck and it's trappy since it is enabled by a special value set on `index.translog.sync_interval`. `index.translog.sync_interval`
-no longer accepts a value less than `100ms`, which prevents fsyncing too often if async durability is enabled. The special value `0` is not supported anymore.
-
-==== Request Cache Settings
-
-The deprecated settings `index.cache.query.enable` and `indices.cache.query.size` have been removed and are replaced with
-`index.requests.cache.enable` and `indices.requests.cache.size` respectively.
-
-`indices.requests.cache.clean_interval` has been replaced with `indices.cache.clean_interval` and is no longer supported.
-
-==== Field Data Cache Settings
-
-`indices.fielddata.cache.clean_interval` has been replaced with `indices.cache.clean_interval` and is no longer supported.
-
-==== Allocation settings
-
-Allocation settings deprecated in 1.x have been removed:
-
- * `cluster.routing.allocation.concurrent_recoveries` is superseded by `cluster.routing.allocation.node_concurrent_recoveries`
-
-Please change the setting in your configuration files or in the cluster state to use the new setting instead.
-
-==== Similarity settings
-
-The 'default' similarity has been renamed to 'classic'.
-
-==== Indexing settings
-
-`indices.memory.min_shard_index_buffer_size` and `indices.memory.max_shard_index_buffer_size` are removed since Elasticsearch now allows any one shard to use any
-amount of heap as long as the total indexing buffer heap used across all shards is below the node's `indices.memory.index_buffer_size` (default: 10% of the JVM heap).
-
-==== Removed es.max-open-files
-
-Setting the system property es.max-open-files to true to get
-Elasticsearch to print the number of maximum open files for the
-Elasticsearch process has been removed. This same information can be
-obtained from the <<cluster-nodes-info>> API, and a warning is logged
-on startup if it is set too low.
-
-==== Removed es.netty.gathering
-
-Disabling Netty from using NIO gathering could be done via the escape
-hatch of setting the system property "es.netty.gathering" to "false".
-Time has proven enabling gathering by default is a non-issue and this
-non-documented setting has been removed.
-
-==== Removed es.useLinkedTransferQueue
-
-The system property `es.useLinkedTransferQueue` could be used to
-control the queue implementation used in the cluster service and the
-handling of ping responses during discovery. This was an undocumented
-setting and has been removed.
-
-[[breaking_50_mapping_changes]]
-=== Mapping changes
-
-==== Default doc values settings
-
-Doc values are now also on by default on numeric and boolean fields that are
-not indexed.
-
-==== Transform removed
-
-The `transform` feature from mappings has been removed. It made issues very hard to debug.
-
-==== Default number mappings
-
-When a floating-point number is encountered, it is now dynamically mapped as a
-float by default instead of a double. The reasoning is that floats should be
-more than enough for most cases but would decrease storage requirements
-significantly.
-
-==== `index` property
-
-On all types but `string`, the `index` property now only accepts `true`/`false`
-instead of `not_analyzed`/`no`. The `string` field still accepts
-`analyzed`/`not_analyzed`/`no`.
-
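For mappings generated by code, the change to the `index` property could be handled per field as sketched below. The function name is hypothetical and the sketch looks at a single field definition only. `string` fields are left untouched because they still accept the old values.

```python
# Old value -> new boolean for every type except `string`.
LEGACY_INDEX_VALUES = {"not_analyzed": True, "no": False}

def migrate_index_property(field_mapping):
    """Convert `index: not_analyzed|no` to `true|false` on non-string fields."""
    migrated = dict(field_mapping)
    if migrated.get("type") != "string" and "index" in migrated:
        value = migrated["index"]
        migrated["index"] = LEGACY_INDEX_VALUES.get(value, value)
    return migrated

print(migrate_index_property({"type": "long", "index": "not_analyzed"}))
print(migrate_index_property({"type": "date", "index": "no"}))
print(migrate_index_property({"type": "string", "index": "not_analyzed"}))
```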
-==== ++_source++'s `format` option
-
-The `_source` mapping does not support the `format` option anymore. This option
-will still be accepted for indices created before the upgrade to 5.0 for backward
-compatibility, but it will have no effect. Indices created on or after 5.0 will
-reject this option.
-
-==== Object notation
-
-Core types don't support the object notation anymore, which allowed values to be
-provided as follows:
-
-[source,json]
----------------
-{
-  "value": "field_value",
-  "boost": 42
-}
----------------
-
-==== `fielddata.format`
-
-Setting `fielddata.format: doc_values` in the mappings used to implicitly
-enable doc values on a field. This no longer works: the only way to enable or
-disable doc values is by using the `doc_values` property of mappings.
-
-
-[[breaking_50_plugins]]
-=== Plugin changes
-
-The command `bin/plugin` has been renamed to `bin/elasticsearch-plugin`.
-The structure of the plugin has changed. All the plugin files must be contained in a directory called `elasticsearch`.
-If you use the gradle build, this structure is automatically generated.
-
-==== Site plugins removed
-
-Site plugins have been removed. It is recommended to migrate site plugins to Kibana plugins.
-
-==== Multicast plugin removed
-
-Multicast has been removed. Use unicast discovery, or one of the cloud discovery plugins.
-
-==== Plugins with custom query implementations
-
-Plugins implementing custom queries need to implement the `fromXContent(QueryParseContext)` method in their
-`QueryParser` subclass rather than `parse`. This method will take care of parsing the query from `XContent` format
-into an intermediate query representation that can be streamed between the nodes in binary format, effectively the
-query object used in the java api. Also, the query parser needs to implement the `getBuilderPrototype` method that
-returns a prototype of the `NamedWriteable` query, which allows an incoming query to be deserialized by calling
-`readFrom(StreamInput)` against it, which will create a new object, see usages of `Writeable`. The `QueryParser`
-also needs to declare the generic type of the query that it supports and is able to parse.
-The query object can then transform itself into a lucene query through the new `toQuery(QueryShardContext)` method,
-which returns a lucene query to be executed on the data node.
-
-Similarly, plugins implementing custom score functions need to implement the `fromXContent(QueryParseContext)`
-method in their `ScoreFunctionParser` subclass rather than `parse`. This method will take care of parsing
-the function from `XContent` format into an intermediate function representation that can be streamed between
-the nodes in binary format, effectively the function object used in the java api. Also, the query parser needs
-to implement the `getBuilderPrototype` method that returns a prototype of the `NamedWriteable` function, which
-allows an incoming function to be deserialized by calling `readFrom(StreamInput)` against it, which will create a
-new object, see usages of `Writeable`. The `ScoreFunctionParser` also needs to declare the generic type of the
-function that it supports and is able to parse. The function object can then transform itself into a lucene
-function through the new `toFunction(QueryShardContext)` method, which returns a lucene function to be executed
-on the data node.
-
-==== Cloud AWS plugin changes
-
-Cloud AWS plugin has been split in two plugins:
-
-* {plugins}/discovery-ec2.html[Discovery EC2 plugin]
-* {plugins}/repository-s3.html[Repository S3 plugin]
-
-Proxy settings for both plugins have been renamed:
-
-* from `cloud.aws.proxy_host` to `cloud.aws.proxy.host`
-* from `cloud.aws.ec2.proxy_host` to `cloud.aws.ec2.proxy.host`
-* from `cloud.aws.s3.proxy_host` to `cloud.aws.s3.proxy.host`
-* from `cloud.aws.proxy_port` to `cloud.aws.proxy.port`
-* from `cloud.aws.ec2.proxy_port` to `cloud.aws.ec2.proxy.port`
-* from `cloud.aws.s3.proxy_port` to `cloud.aws.s3.proxy.port`
-
-==== Cloud Azure plugin changes
-
-Cloud Azure plugin has been split in three plugins:
-
-* {plugins}/discovery-azure.html[Discovery Azure plugin]
-* {plugins}/repository-azure.html[Repository Azure plugin]
-* {plugins}/store-smb.html[Store SMB plugin]
-
-If you were using the `cloud-azure` plugin for snapshot and restore, you had in `elasticsearch.yml`:
-
-[source,yaml]
------
-cloud:
-    azure:
-        storage:
-            account: your_azure_storage_account
-            key: your_azure_storage_key
------
-
-You need to give a unique id to the storage details now as you can define multiple storage accounts:
-
-[source,yaml]
------
-cloud:
-    azure:
-        storage:
-            my_account:
-                account: your_azure_storage_account
-                key: your_azure_storage_key
------
-
-
-==== Cloud GCE plugin changes
-
-Cloud GCE plugin has been renamed to {plugins}/discovery-gce.html[Discovery GCE plugin].
-
-
-==== Mapper Attachments plugin deprecated
-
-Mapper attachments has been deprecated. Users should now use the {plugins}/ingest-attachment.html[`ingest-attachment`]
-plugin.
-
-
-[[breaking_50_java_api_changes]]
-=== Java API changes
-
-==== Count api has been removed
-
-The deprecated count api has been removed from the Java api, use the search api instead and set size to 0.
-
-The following call
-
-[source,java]
------
-client.prepareCount(indices).setQuery(query).get();
------
-
-can be replaced with
-
-[source,java]
------
-client.prepareSearch(indices).setSource(new SearchSourceBuilder().size(0).query(query)).get();
------
-
-==== BoostingQueryBuilder
-
-Removed setters for mandatory positive/negative query. Both arguments now have
-to be supplied at construction time already and have to be non-null.
-
-==== SpanContainingQueryBuilder
-
-Removed setters for mandatory big/little inner span queries. Both arguments now have
-to be supplied at construction time already and have to be non-null. Updated
-static factory methods in QueryBuilders accordingly.
-
-==== SpanOrQueryBuilder
-
-Making sure that query contains at least one clause by making initial clause mandatory
-in constructor.
-
-==== SpanNearQueryBuilder
-
-Removed setter for mandatory slop parameter, needs to be set in constructor now. Also
-making sure that query contains at least one clause by making initial clause mandatory
-in constructor. Updated the static factory methods in QueryBuilders accordingly.
-
-==== SpanNotQueryBuilder
-
-Removed setter for mandatory include/exclude span query clause, needs to be set in constructor now.
-Updated the static factory methods in QueryBuilders and tests accordingly.
-
-==== SpanWithinQueryBuilder
-
-Removed setters for mandatory big/little inner span queries. Both arguments now have
-to be supplied at construction time already and have to be non-null. Updated
-static factory methods in QueryBuilders accordingly.
-
-==== QueryFilterBuilder
-
-Removed the setter `queryName(String queryName)` since this field is not supported
-in this type of query. Use `FQueryFilterBuilder.queryName(String queryName)` instead
-when in need to wrap a named query as a filter.
-
-==== WrapperQueryBuilder
-
-Removed `wrapperQueryBuilder(byte[] source, int offset, int length)`. Instead simply
-use  `wrapperQueryBuilder(byte[] source)`. Updated the static factory methods in
-QueryBuilders accordingly.
-
-==== QueryStringQueryBuilder
-
-Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`.
-Use the `field(String, float)` method instead.
-
-==== Operator
-
-Removed the enums called `Operator` from `MatchQueryBuilder`, `QueryStringQueryBuilder`,
-`SimpleQueryStringBuilder`, and `CommonTermsQueryBuilder` in favour of using the enum
-defined in `org.elasticsearch.index.query.Operator` in an effort to consolidate the
-codebase and avoid duplication.
-
-==== queryName and boost support
-
-Support for `queryName` and `boost` has been streamlined to all of the queries. That is
-a breaking change till queries get sent over the network as serialized json rather
-than in `Streamable` format. In fact whenever additional fields are added to the json
-representation of the query, older nodes might throw an error when they find unknown fields.
-
-==== InnerHitsBuilder
-
-InnerHitsBuilder now has dedicated addParentChildInnerHits and addNestedInnerHits methods
-to differentiate between inner hits for nested vs. parent / child documents. This change
-makes the type / path parameter mandatory.
-
-==== MatchQueryBuilder
-
-Moving MatchQueryBuilder.Type and MatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.Type.
-Also reusing new Operator enum.
-
-==== MoreLikeThisQueryBuilder
-
-Removed `MoreLikeThisQueryBuilder.Item#id(String id)`, `Item#doc(BytesReference doc)`,
-`Item#doc(XContentBuilder doc)`. Use provided constructors instead.
-
-Removed `MoreLikeThisQueryBuilder#addLike` in favor of texts and/or items being provided
-at construction time. Using arrays there instead of lists now.
-
-Removed `MoreLikeThisQueryBuilder#addUnlike` in favor of using the `unlike` methods
-which take arrays as arguments now rather than the lists used before.
-
-The deprecated `docs(Item... docs)`, `ignoreLike(Item... docs)`,
-`ignoreLike(String... likeText)`, `addItem(Item... likeItems)` have been removed.
-
-==== GeoDistanceQueryBuilder
-
-Removing individual setters for lon() and lat() values, both values should be set together
- using point(lon, lat).
-
-==== GeoDistanceRangeQueryBuilder
-
-Removing setters for to(Object ...) and from(Object ...) in favour of the only two allowed input
-arguments (String, Number). Removing setter for center point (point(), geohash()) because parameter
-is mandatory and should already be set in constructor.
-Also removing setters for lt(), lte(), gt(), gte() since they can all be replaced by equivalent
-calls to to/from() and includeLower()/includeUpper().
-
-==== GeoPolygonQueryBuilder
-
-Require shell of polygon already to be specified in constructor instead of adding it pointwise.
-This enables validation, but makes it necessary to remove the addPoint() methods.
-
-==== MultiMatchQueryBuilder
-
-Moving MultiMatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.ZeroTermsQuery.
-Also reusing new Operator enum.
-
-Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`.
-Use the `field(String, float)` method instead.
-
-==== MissingQueryBuilder
-
-The MissingQueryBuilder which was deprecated in 2.2.0 is removed. As a replacement use ExistsQueryBuilder
-inside a mustNot() clause. So instead of using `new MissingQueryBuilder(name)` now use
-`new BoolQueryBuilder().mustNot(new ExistsQueryBuilder(name))`.
-
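The same replacement can be sketched on the JSON form of the query. This assumes that the `missing` query body mirrors the Java builders described above; the function name `missing_to_bool` is hypothetical.

```python
def missing_to_bool(query):
    """Rewrite `missing` as `bool` with a `must_not` `exists` clause."""
    if "missing" not in query:
        return query
    field = query["missing"]["field"]
    return {"bool": {"must_not": {"exists": {"field": field}}}}

old_query = {"missing": {"field": "user"}}
new_query = missing_to_bool(old_query)
print(new_query)
```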
-==== NotQueryBuilder
-
-The NotQueryBuilder which was deprecated in 2.1.0 is removed. As a replacement use BoolQueryBuilder
-with added mustNot() clause. So instead of using `new NotQueryBuilder(filter)` now use
-`new BoolQueryBuilder().mustNot(filter)`.
-
-==== TermsQueryBuilder
-
-Remove the setter for `termsLookup()`, making it only possible to either use a TermsLookup object or
-individual values at construction time. Also moving individual settings for the TermsLookup (lookupIndex,
-lookupType, lookupId, lookupPath) to the separate TermsLookup class, using constructor only and moving
-checks for validation there. Removed `TermsLookupQueryBuilder` in favour of `TermsQueryBuilder`.
-
-==== FunctionScoreQueryBuilder
-
-`add` methods have been removed, all filters and functions must be provided as constructor arguments by
-creating an array of `FunctionScoreQueryBuilder.FilterFunctionBuilder` objects, containing one element
-for each filter/function pair.
-
-`scoreMode` and `boostMode` can only be provided using corresponding enum members instead
-of string values: see `FilterFunctionScoreQuery.ScoreMode` and `CombineFunction`.
-
-`CombineFunction.MULT` has been renamed to `MULTIPLY`.
-
-==== IdsQueryBuilder
-
-For simplicity, only one way of adding the ids to the existing list (empty by default)  is left: `addIds(String...)`
-
-==== DocumentAlreadyExistsException removed
-
-`DocumentAlreadyExistsException` is removed and a `VersionConflictException` is thrown instead (with a better
-error description). This will influence code that uses `IndexRequest.opType()` or `IndexRequest.create()`
-to index a document only if it doesn't already exist.
-
-==== ShapeBuilders
-
-`InternalLineStringBuilder` is removed in favour of `LineStringBuilder`, `InternalPolygonBuilder` in favour of `PolygonBuilder` and `Ring` has been replaced with `LineStringBuilder`. Also the abstract base classes `BaseLineStringBuilder` and `BasePolygonBuilder` have been merged with their corresponding implementations.
-
-==== RescoreBuilder
-
-`RescoreBuilder.Rescorer` was merged with `RescoreBuilder`, which now is an abstract superclass. QueryRescoreBuilder currently is its only implementation.
-
-==== PhraseSuggestionBuilder
-
-The inner DirectCandidateGenerator class has been moved out to its own class called DirectCandidateGeneratorBuilder.
-
-==== Elasticsearch will no longer detect logging implementations
-
-Elasticsearch now logs only to log4j 1.2. Previously if log4j wasn't on the classpath it made some effort to degrade to
-slf4j or java.util.logging. Now it'll fail to work without the log4j 1.2 api. The log4j-over-slf4j bridge ought to work
-when using the java client, as should log4j 2's log4j-1.2-api. The Elasticsearch server now only supports log4j as
-configured by logging.yml and it no longer makes any effort to work if log4j isn't present.
-
-[[breaking_50_cache_concurrency]]
-=== Cache concurrency level settings removed
-
-Two cache concurrency level settings, `indices.requests.cache.concurrency_level` and
-`indices.fielddata.cache.concurrency_level`, have been removed because they no longer apply to the cache implementation used for the
-request cache and the field data cache.
-
-[[breaking_50_non_loopback]]
-=== Remove bind option of `non_loopback`
-
-This setting would arbitrarily pick the first interface not marked as loopback. Instead, specify by address
-scope (e.g. `_local_,_site_` for all loopback and private network addresses) or by explicit interface names,
-hostnames, or addresses.
-
-[[breaking_50_thread_pool]]
-=== Forbid changing of thread pool types
-
-Previously, <<modules-threadpool,thread pool types>> could be dynamically adjusted. The thread pool type effectively
-controls the backing queue for the thread pool and modifying this is an expert setting with minimal practical benefits
-and high risk of being misused. The ability to change the thread pool type for any thread pool has been removed; do note
-that it is still possible to adjust relevant thread pool parameters for each of the thread pools (e.g., depending on
-the thread pool type, `keep_alive`, `queue_size`, etc.).
-
-[[breaking_50_cpu_stats]]
-=== System CPU stats
-
-The recent CPU usage (as a percent) has been added to the OS stats
-reported under the node stats API and the cat nodes API. The breaking
-change here is that there is a new object in the `os` object in the node
-stats response. This object is called `cpu` and includes `percent` and
-`load_average` as fields. This moves the `load_average` field that was
-previously a top-level field in the `os` object to the `cpu` object. The
-format of the `load_average` field has changed to an object with fields
-`1m`, `5m`, and `15m` representing the one-minute, five-minute and
-fifteen-minute loads respectively. If any of these fields are not present,
-it indicates that the corresponding value is not available.
-
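Monitoring code that reads the node stats response might extract the new load averages like this. The sketch works on the `os` object only, the sample values are made up, and missing fields are reported as `None` to reflect "value not available".

```python
def read_load_average(os_stats):
    """Return the 1m/5m/15m load averages from the `os` object of node stats.

    A field that is absent in the response is returned as None.
    """
    load = os_stats.get("cpu", {}).get("load_average", {})
    return {window: load.get(window) for window in ("1m", "5m", "15m")}

# Hypothetical `os` object, for illustration only.
sample_os = {"cpu": {"percent": 12, "load_average": {"1m": 0.5, "5m": 0.4}}}
loads = read_load_average(sample_os)
print(loads)
```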
-In the cat nodes API response, the `cpu` field is output by default. The
-previous `load` field has been removed and is replaced by `load_1m`,
-`load_5m`, and `load_15m` which represent the one-minute, five-minute
-and fifteen-minute loads respectively. The field will be null if the
-corresponding value is not available.
-
-Finally, the API for `org.elasticsearch.monitor.os.OsStats` has
-changed. The `getLoadAverage` method has been removed. The value for
-this can now be obtained from `OsStats.Cpu#getLoadAverage` but it is no
-longer a double and is instead an object encapsulating the one-minute,
-five-minute and fifteen-minute load averages. Additionally, the recent
-CPU usage can be obtained from `OsStats.Cpu#getPercent`.
-
-=== Fields option
-Only stored fields are retrievable with this option.
-The fields option won't be able to load non stored fields from _source anymore.
-
-[[breaking_50_allocation]]
-=== Primary shard allocation
-
-Previously, primary shards were only assigned if a quorum of shard copies were found (configurable using
-`index.recovery.initial_shards`, now deprecated). In case where a primary had only a single replica, quorum was defined
-to be a single shard. This meant that any shard copy of an index with replication factor 1 could become primary, even it
-was a stale copy of the data on disk. This is now fixed by using allocation IDs.
-
-Allocation IDs assign unique identifiers to shard copies. This allows the cluster to differentiate between multiple
-copies of the same data and track which shards have been active, so that after a cluster restart, shard copies
-containing only the most recent data can become primaries.
-
-=== Indices Shard Stores command
-
-By using allocation IDs instead of version numbers to identify shard copies for primary shard allocation, the former versioning scheme
-has become obsolete. This is reflected in the indices-shards-stores.html[Indices Shard Stores API]. A new field `allocation_id` replaces the
-former `version` field in the result of the Indices Shard Stores command. This field is available for all shard copies that have been either
-created with the current version of Elasticsearch or have been active in a cluster running a current version of Elasticsearch. For legacy
-shard copies that have not been active in a current version of Elasticsearch, a `legacy_version` field is available instead (equivalent to
-the former `version` field).
-
-=== Reroute commands
-
-The reroute command `allocate` has been split into two distinct commands `allocate_replica` and `allocate_empty_primary`.
-This was done as we introduced a new `allocate_stale_primary` command. The new `allocate_replica` command corresponds to the
-old `allocate` command  with `allow_primary` set to false. The new `allocate_empty_primary` command corresponds to the old
-`allocate` command with `allow_primary` set to true.
-
-==== `index.shared_filesystem.recover_on_any_node` changes
-
-The behavior of `index.shared_filesystem.recover_on_any_node = true` has been changed. Previously, in the case where no
-shard copies could be found, an arbitrary node was chosen by potentially ignoring allocation deciders. Now, we take
-balancing into account but don't assign the shard if the allocation deciders are not satisfied. The behavior has also changed
-in the case where shard copies can be found. Previously, a node not holding the shard copy was chosen if none of the nodes
-holding shard copies were satisfying the allocation deciders. Now, the shard will be assigned to a node having a shard copy,
-even if none of the nodes holding a shard copy satisfy the allocation deciders.
-
-[[breaking_50_percolator]]
-=== Percolator
-
-Adding percolator queries and modifications to existing percolator queries are no longer visible in immediately
-to the percolator. A refresh is required to run before the changes are visible to the percolator.
-
-The reason that this has changed is that on newly created indices the percolator automatically indexes the query terms
-and these query terms are used at percolate time to reduce the amount of queries the percolate API needs evaluate.
-This optimization didn't work in the percolate API mode where modifications to queries are immediately visible.
-
-The percolator by defaults sets the `size` option to `10` whereas before this was set to unlimited.
-
-The percolate api can no longer accept documents that have fields that don't exist in the mapping.
-
-When percolating an existing document then specifying a document in the source of the percolate request is not allowed
-any more.
-
-The percolate api no longer modifies the mappings. Before the percolate api could be used to dynamically introduce new
-fields to the mappings based on the fields in the document being percolated. This no longer works, because these
-unmapped fields are not persisted in the mapping.
-
-Percolator documents are no longer excluded from the search response.
-
-[[breaking_50_packaging]]
-=== Packaging
-
-==== Default logging using systemd (since Elasticsearch 2.2.0)
-
-In previous versions of Elasticsearch, the default logging
-configuration routed standard output to /dev/null and standard error to
-the journal. However, there are often critical error messages at
-startup that are logged to standard output rather than standard error
-and these error messages would be lost to the nether. The default has
-changed to now route standard output to the journal and standard error
-to inherit this setting (these are the defaults for systemd). These
-settings can be modified by editing the elasticsearch.service file.
-
-==== Longer startup times
-
-In Elasticsearch 5.0.0 the `-XX:+AlwaysPreTouch` flag has been added to the JVM
-startup options. This option touches all memory pages used by the JVM heap
-during initialization of the HotSpot VM to reduce the chance of having to commit
-a memory page during GC time. This will increase the startup time of
-Elasticsearch as well as increasing the initial resident memory usage of the
-Java process.
+* <<breaking_50_plugins>>
 
-[[breaking_50_scripting]]
-=== Scripting
+include::migrate_5_0/search.asciidoc[]
 
-==== Script mode settings
+include::migrate_5_0/mapping.asciidoc[]
 
-Previously script mode settings (e.g., "script.inline: true",
-"script.engine.groovy.inline.aggs: false", etc.) accepted the values
-`on`, `true`, `1`, and `yes` for enabling a scripting mode, and the
-values `off`, `false`, `0`, and `no` for disabling a scripting mode.
-The variants `on`, `1`, and `yes ` for enabling and `off`, `0`,
-and `no` for disabling are no longer supported.
+include::migrate_5_0/percolator.asciidoc[]
 
-==== Groovy dependencies
+include::migrate_5_0/index-apis.asciidoc[]
 
-In previous versions of Elasticsearch, the Groovy scripting capabilities
-depended on the `org.codehaus.groovy:groovy-all` artifact.  In addition
-to pulling in the Groovy language, this pulls in a very large set of
-functionality, none of which is needed for scripting within
-Elasticsearch. Aside from the inherent difficulties in managing such a
-large set of dependencies, this also increases the surface area for
-security issues. This dependency has been reduced to the core Groovy
-language `org.codehaus.groovy:groovy` artifact.
+include::migrate_5_0/settings.asciidoc[]
 
-[[breaking_50_term_vectors]]
-=== Term vectors
+include::migrate_5_0/allocation.asciidoc[]
 
-The term vectors APIs no longer persist unmapped fields in the mappings.
+include::migrate_5_0/rest.asciidoc[]
 
-The `dfs` parameter has been removed completely, term vectors don't support
-distributed document frequencies anymore.
+include::migrate_5_0/cat.asciidoc[]
 
-[[breaking_50_security]]
-=== Security
+include::migrate_5_0/java.asciidoc[]
 
-The option to disable the security manager `--security.manager.enabled` has been removed. In order to grant special
-permissions to elasticsearch users must tweak the local Java Security Policy.
+include::migrate_5_0/packaging.asciidoc[]
 
-[[breaking_50_snapshot_restore]]
-=== Snapshot/Restore
+include::migrate_5_0/plugins.asciidoc[]
 
-==== Closing / deleting indices while running snapshot
 
-In previous versions of Elasticsearch, closing or deleting an index during a full snapshot would make the snapshot fail. This is now changed
-by failing the close/delete index request instead. The behavior for partial snapshots remains unchanged: Closing or deleting an index during
-a partial snapshot is still possible. The snapshot result is then marked as partial.

+ 54 - 0
docs/reference/migration/migrate_5_0/allocation.asciidoc

@@ -0,0 +1,54 @@
+[[breaking_50_allocation]]
+=== Allocation changes
+
+==== Primary shard allocation
+
+Previously, primary shards were only assigned if a quorum of shard copies were
+found (configurable using `index.recovery.initial_shards`, now deprecated). In
+the case where a primary had only a single replica, quorum was defined to be a
+single shard. This meant that any shard copy of an index with replication
+factor 1 could become primary, even if it was a stale copy of the data on disk.
+This is now fixed thanks to shard allocation IDs.
+
+Allocation IDs assign unique identifiers to shard copies. This allows the
+cluster to differentiate between multiple copies of the same data and track
+which shards have been active so that, after a cluster restart, only shard
+copies containing the most recent data can become primaries.
+
+==== Indices Shard Stores command
+
+By using allocation IDs instead of version numbers to identify shard copies
+for primary shard allocation, the former versioning scheme has become
+obsolete. This is reflected in the
+<<indices-shards-stores,Indices Shard Stores API>>.
+
+A new `allocation_id` field replaces the former `version` field in the result
+of the Indices Shard Stores command. This field is available for all shard
+copies that have been either created with the current version of Elasticsearch
+or have been active in a cluster running a current version of Elasticsearch.
+For legacy shard copies that have not been active in a current version of
+Elasticsearch, a `legacy_version` field is available instead (equivalent to
+the former `version` field).
+
+==== Reroute commands
+
+The reroute command `allocate` has been split into two distinct commands
+`allocate_replica` and `allocate_empty_primary`. This was done as we
+introduced a new `allocate_stale_primary` command. The new `allocate_replica`
+command corresponds to the old `allocate` command with `allow_primary` set to
+false. The new `allocate_empty_primary` command corresponds to the old
+`allocate` command with `allow_primary` set to true.
+
+==== `index.shared_filesystem.recover_on_any_node` changes
+
+The behavior of `index.shared_filesystem.recover_on_any_node: true` has been
+changed. Previously, in the case where no shard copies could be found, an
+arbitrary node was chosen by potentially ignoring allocation deciders. Now, we
+take balancing into account but don't assign the shard if the allocation
+deciders are not satisfied.
+
+The behavior has also changed in the case where shard copies can be found.
+Previously, a node not holding the shard copy was chosen if none of the nodes
+holding shard copies were satisfying the allocation deciders. Now, the shard
+will be assigned to a node having a shard copy, even if none of the nodes
+holding a shard copy satisfy the allocation deciders.
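
The reroute command split described under "Reroute commands" above can be sketched as a small lookup. This is only an illustration and not part of the commit: the class `RerouteCommandMigration` and its method are hypothetical names, and only the command names and the `allow_primary` correspondence come from the text.

```java
/** Hypothetical helper for illustration; not an Elasticsearch class. */
final class RerouteCommandMigration {

    /**
     * Returns the 5.0 command name that corresponds to the old `allocate`
     * command with the given `allow_primary` value.
     */
    static String replacementFor(boolean allowPrimary) {
        // allow_primary: true  -> allocate_empty_primary
        // allow_primary: false -> allocate_replica
        return allowPrimary ? "allocate_empty_primary" : "allocate_replica";
    }

    public static void main(String[] args) {
        System.out.println(replacementFor(false));
        System.out.println(replacementFor(true));
    }
}
```

The new `allocate_stale_primary` command has no counterpart in the old `allocate` command, so the sketch deliberately never returns it.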

+ 33 - 0
docs/reference/migration/migrate_5_0/cat.asciidoc

@@ -0,0 +1,33 @@
+[[breaking_50_cat_api]]
+=== CAT API changes
+
+==== Use Accept header for specifying response media type
+
+Previous versions of Elasticsearch accepted the Content-Type header
+field for controlling the media type of the response in the cat API.
+This is in opposition to the HTTP spec which specifies the Accept
+header field for this purpose. Elasticsearch now uses the Accept header
+field and support for using the Content-Type header field for this
+purpose has been removed.
+
+==== Host field removed from the cat nodes API
+
+The `host` field has been removed from the cat nodes API as its value
+is always equal to the `ip` field. The `name` field is available in the
+cat nodes API and should be used instead of the `host` field.
+
+==== Changes to cat recovery API
+
+The fields `bytes_recovered` and `files_recovered` have been added to
+the cat recovery API. These fields, respectively, indicate the total
+number of bytes and files that have been recovered.
+
+The fields `total_files` and `total_bytes` have been renamed to
+`files_total` and `bytes_total`, respectively.
+
+Additionally, the field `translog` has been renamed to
+`translog_ops_recovered`, the field `translog_total` to
+`translog_ops` and the field `translog_percent` to
+`translog_ops_percent`. The short aliases for these fields are `tor`,
+`to`, and `top`, respectively.
+
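For scripts that select cat recovery columns by name, the renames listed under "Changes to cat recovery API" could be applied with a lookup along these lines. This is a sketch under assumptions: `CatRecoveryColumns` and `toNewName` are hypothetical names, and only the old and new column names are taken from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not an Elasticsearch class. */
final class CatRecoveryColumns {

    /** Old column name -> 5.0 column name, as listed in the text. */
    static final Map<String, String> RENAMED = new LinkedHashMap<>();

    static {
        RENAMED.put("total_files", "files_total");
        RENAMED.put("total_bytes", "bytes_total");
        RENAMED.put("translog", "translog_ops_recovered");
        RENAMED.put("translog_total", "translog_ops");
        RENAMED.put("translog_percent", "translog_ops_percent");
    }

    /** Returns the 5.0 name for a column, or the name unchanged if it was not renamed. */
    static String toNewName(String column) {
        return RENAMED.getOrDefault(column, column);
    }

    public static void main(String[] args) {
        for (String old : RENAMED.keySet()) {
            System.out.println(old + " -> " + toNewName(old));
        }
    }
}
```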

+ 48 - 0
docs/reference/migration/migrate_5_0/index-apis.asciidoc

@@ -0,0 +1,48 @@
+[[breaking_50_index_apis]]
+=== Index APIs changes
+
+==== Closing / deleting indices while running snapshot
+
+In previous versions of Elasticsearch, closing or deleting an index during a
+full snapshot would make the snapshot fail. In 5.0, the close/delete index
+request will fail instead. The behavior for partial snapshots remains
+unchanged: Closing or deleting an index during a partial snapshot is still
+possible. The snapshot result is then marked as partial.
+
+==== Warmers
+
+Thanks to several changes like doc values by default and disk-based norms,
+warmers are no longer useful. As a consequence, warmers and the warmer API
+have been removed: it is no longer possible to register queries that will run
+before a new IndexSearcher is published.
+
+Don't worry if you have warmers defined on your indices: they will simply be
+ignored when upgrading to 5.0.
+
+==== System CPU stats
+
+The recent CPU usage (as a percent) has been added to the OS stats
+reported under the node stats API and the cat nodes API. The breaking
+change here is that there is a new object in the `os` object in the node
+stats response. This object is called `cpu` and includes `percent` and
+`load_average` as fields. This moves the `load_average` field that was
+previously a top-level field in the `os` object to the `cpu` object. The
+format of the `load_average` field has changed to an object with fields
+`1m`, `5m`, and `15m` representing the one-minute, five-minute and
+fifteen-minute loads respectively. If any of these fields are not present,
+it indicates that the corresponding value is not available.
+
+In the cat nodes API response, the `cpu` field is output by default. The
+previous `load` field has been removed and is replaced by `load_1m`,
+`load_5m`, and `load_15m` which represent the one-minute, five-minute
+and fifteen-minute loads respectively. The field will be null if the
+corresponding value is not available.
+
+Finally, the API for `org.elasticsearch.monitor.os.OsStats` has
+changed. The `getLoadAverage` method has been removed. The value for
+this can now be obtained from `OsStats.Cpu#getLoadAverage` but it is no
+longer a double and is instead an object encapsulating the one-minute,
+five-minute and fifteen-minute load averages. Additionally, the recent
+CPU usage can be obtained from `OsStats.Cpu#getPercent`.
+
+
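The move of `load_average` into the `cpu` object, and its change from a single number to an object with `1m`, `5m`, and `15m` fields, might be handled by a client roughly as follows. This is a minimal sketch, assuming the node stats `os` object has already been parsed into nested maps; `LoadAverageReader`, `sampleOs`, and the sample values are hypothetical and only the field names come from the text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

/** Hypothetical helper for illustration; not an Elasticsearch class. */
final class LoadAverageReader {

    /** Builds a hand-made `os` object in the 5.0 shape; the values are made up. */
    static Map<String, Object> sampleOs(double oneMinute) {
        Map<String, Object> loadAverage = new HashMap<>();
        loadAverage.put("1m", oneMinute);
        Map<String, Object> cpu = new HashMap<>();
        cpu.put("percent", 12);
        cpu.put("load_average", loadAverage);
        Map<String, Object> os = new HashMap<>();
        os.put("cpu", cpu);
        return os;
    }

    /** Reads os.cpu.load_average.1m; empty when any level is missing. */
    static OptionalDouble oneMinuteLoad(Map<String, Object> os) {
        Object cpu = os.get("cpu");
        if (!(cpu instanceof Map)) {
            return OptionalDouble.empty();
        }
        Object loadAverage = ((Map<?, ?>) cpu).get("load_average");
        if (!(loadAverage instanceof Map)) {
            return OptionalDouble.empty();
        }
        // A missing field means the value is not available, per the text.
        Object value = ((Map<?, ?>) loadAverage).get("1m");
        return value instanceof Number
                ? OptionalDouble.of(((Number) value).doubleValue())
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        System.out.println(oneMinuteLoad(sampleOs(0.5)));
        System.out.println(oneMinuteLoad(new HashMap<>()));
    }
}
```

Returning an `OptionalDouble` rather than a default number keeps "not available" distinct from a load of zero.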

+ 213 - 0
docs/reference/migration/migrate_5_0/java.asciidoc

@@ -0,0 +1,213 @@
+
+
+
+[[breaking_50_java_api_changes]]
+=== Java API changes
+
+==== Count api has been removed
+
+The deprecated count API has been removed from the Java API. Use the search API instead and set `size` to 0.
+
+The following call
+
+[source,java]
+-----
+client.prepareCount(indices).setQuery(query).get();
+-----
+
+can be replaced with
+
+[source,java]
+-----
+client.prepareSearch(indices).setSource(new SearchSourceBuilder().size(0).query(query)).get();
+-----
+
+==== Elasticsearch will no longer detect logging implementations
+
+Elasticsearch now logs only to log4j 1.2. Previously if log4j wasn't on the
+classpath it made some effort to degrade to slf4j or java.util.logging. Now it
+will fail to work without the log4j 1.2 api. The log4j-over-slf4j bridge ought
+to work when using the java client, as should log4j 2's log4j-1.2-api. The
+Elasticsearch server now only supports log4j as configured by `logging.yml`
+and will fail if log4j isn't present.
+
+==== Groovy dependencies
+
+In previous versions of Elasticsearch, the Groovy scripting capabilities
+depended on the `org.codehaus.groovy:groovy-all` artifact.  In addition
+to pulling in the Groovy language, this pulls in a very large set of
+functionality, none of which is needed for scripting within
+Elasticsearch. Aside from the inherent difficulties in managing such a
+large set of dependencies, this also increases the surface area for
+security issues. This dependency has been reduced to the core Groovy
+language `org.codehaus.groovy:groovy` artifact.
+
+==== DocumentAlreadyExistsException removed
+
+`DocumentAlreadyExistsException` is removed and a `VersionConflictException` is thrown instead (with a better
+error description). This will influence code that uses the `IndexRequest.opType()` or `IndexRequest.create()`
+to index a document only if it doesn't already exist.
+
+==== Changes to Query Builders
+
+===== BoostingQueryBuilder
+
+Removed setters for mandatory positive/negative query. Both arguments now have
+to be supplied at construction time already and have to be non-null.
+
+===== SpanContainingQueryBuilder
+
+Removed setters for mandatory big/little inner span queries. Both arguments now have
+to be supplied at construction time already and have to be non-null. Updated
+static factory methods in QueryBuilders accordingly.
+
+===== SpanOrQueryBuilder
+
+Making sure that query contains at least one clause by making initial clause mandatory
+in constructor.
+
+===== SpanNearQueryBuilder
+
+Removed setter for mandatory slop parameter, needs to be set in constructor now. Also
+making sure that query contains at least one clause by making initial clause mandatory
+in constructor. Updated the static factory methods in QueryBuilders accordingly.
+
+===== SpanNotQueryBuilder
+
+Removed setter for mandatory include/exclude span query clause, needs to be set in constructor now.
+Updated the static factory methods in QueryBuilders and tests accordingly.
+
+===== SpanWithinQueryBuilder
+
+Removed setters for mandatory big/little inner span queries. Both arguments now have
+to be supplied at construction time already and have to be non-null. Updated
+static factory methods in QueryBuilders accordingly.
+
+===== QueryFilterBuilder
+
+Removed the setter `queryName(String queryName)` since this field is not supported
+in this type of query. Use `FQueryFilterBuilder.queryName(String queryName)` instead
+when in need to wrap a named query as a filter.
+
+===== WrapperQueryBuilder
+
+Removed `wrapperQueryBuilder(byte[] source, int offset, int length)`. Instead simply
+use  `wrapperQueryBuilder(byte[] source)`. Updated the static factory methods in
+QueryBuilders accordingly.
+
+===== QueryStringQueryBuilder
+
+Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`.
+Use the `field(String, float)` method instead.
+
+===== Operator
+
+Removed the enums called `Operator` from `MatchQueryBuilder`, `QueryStringQueryBuilder`,
+`SimpleQueryStringBuilder`, and `CommonTermsQueryBuilder` in favour of using the enum
+defined in `org.elasticsearch.index.query.Operator` in an effort to consolidate the
+codebase and avoid duplication.
+
+===== queryName and boost support
+
+Support for `queryName` and `boost` has been streamlined across all of the queries. This is
+a breaking change for as long as queries are sent over the network as serialized JSON rather
+than in `Streamable` format. In fact, whenever additional fields are added to the JSON
+representation of the query, older nodes might throw an error when they find unknown fields.
+
+===== InnerHitsBuilder
+
+InnerHitsBuilder now has dedicated addParentChildInnerHits and addNestedInnerHits methods
+to differentiate between inner hits for nested vs. parent / child documents. This change
+makes the type / path parameter mandatory.
+
+===== MatchQueryBuilder
+
+Moving MatchQueryBuilder.Type and MatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.Type.
+Also reusing new Operator enum.
+
+===== MoreLikeThisQueryBuilder
+
+Removed `MoreLikeThisQueryBuilder.Item#id(String id)`, `Item#doc(BytesReference doc)`,
+`Item#doc(XContentBuilder doc)`. Use provided constructors instead.
+
+Removed `MoreLikeThisQueryBuilder#addLike` in favor of texts and/or items being provided
+at construction time. Using arrays there instead of lists now.
+
+Removed `MoreLikeThisQueryBuilder#addUnlike` in favor of using the `unlike` methods
+which take arrays as arguments now rather than the lists used before.
+
+The deprecated `docs(Item... docs)`, `ignoreLike(Item... docs)`,
+`ignoreLike(String... likeText)`, `addItem(Item... likeItems)` have been removed.
+
+===== GeoDistanceQueryBuilder
+
+Removing individual setters for lon() and lat() values, both values should be set together
+ using point(lon, lat).
+
+===== GeoDistanceRangeQueryBuilder
+
+Removing setters for to(Object ...) and from(Object ...) in favour of the only two allowed input
+arguments (String, Number). Removing setter for center point (point(), geohash()) because parameter
+is mandatory and should already be set in constructor.
+Also removing setters for lt(), lte(), gt(), gte() since they can all be replaced by equivalent
+calls to to/from() and includeLower()/includeUpper().
+
+===== GeoPolygonQueryBuilder
+
+Require shell of polygon already to be specified in constructor instead of adding it pointwise.
+This enables validation, but makes it necessary to remove the addPoint() methods.
+
+===== MultiMatchQueryBuilder
+
+Moving MultiMatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.ZeroTermsQuery.
+Also reusing new Operator enum.
+
+Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`.
+Use the `field(String, float)` method instead.
+
+===== MissingQueryBuilder
+
+The MissingQueryBuilder which was deprecated in 2.2.0 is removed. As a replacement use ExistsQueryBuilder
+inside a mustNot() clause. So instead of using `new MissingQueryBuilder(name)` now use
+`new BoolQueryBuilder().mustNot(new ExistsQueryBuilder(name))`.
+
+===== NotQueryBuilder
+
+The NotQueryBuilder which was deprecated in 2.1.0 is removed. As a replacement use BoolQueryBuilder
+with added mustNot() clause. So instead of using `new NotQueryBuilder(filter)` now use
+`new BoolQueryBuilder().mustNot(filter)`.
+
+===== TermsQueryBuilder
+
+Remove the setter for `termsLookup()`, making it only possible to either use a TermsLookup object or
+individual values at construction time. Also moving individual settings for the TermsLookup (lookupIndex,
+lookupType, lookupId, lookupPath) to the separate TermsLookup class, using constructor only and moving
+checks for validation there. Removed `TermsLookupQueryBuilder` in favour of `TermsQueryBuilder`.
+
+===== FunctionScoreQueryBuilder
+
+`add` methods have been removed, all filters and functions must be provided as constructor arguments by
+creating an array of `FunctionScoreQueryBuilder.FilterFunctionBuilder` objects, containing one element
+for each filter/function pair.
+
+`scoreMode` and `boostMode` can only be provided using corresponding enum members instead
+of string values: see `FilterFunctionScoreQuery.ScoreMode` and `CombineFunction`.
+
+`CombineFunction.MULT` has been renamed to `MULTIPLY`.
+
+===== IdsQueryBuilder
+
+For simplicity, only one way of adding the ids to the existing list (empty by default)  is left: `addIds(String...)`
+
+===== ShapeBuilders
+
+`InternalLineStringBuilder` is removed in favour of `LineStringBuilder`, `InternalPolygonBuilder` in favour of `PolygonBuilder` and `Ring` has been replaced with `LineStringBuilder`. Also the abstract base classes `BaseLineStringBuilder` and `BasePolygonBuilder` have been merged with their corresponding implementations.
+
+===== RescoreBuilder
+
+`RescoreBuilder.Rescorer` was merged with `RescoreBuilder`, which now is an abstract superclass. QueryRescoreBuilder currently is its only implementation.
+
+===== PhraseSuggestionBuilder
+
+The inner DirectCandidateGenerator class has been moved out to its own class called DirectCandidateGeneratorBuilder.
+
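The GeoDistanceRangeQueryBuilder note above says that lt(), lte(), gt() and gte() can be replaced by equivalent calls to to/from() and includeLower()/includeUpper(). The correspondence can be sketched with a stand-in class. Note that `RangeBounds` below is hypothetical and is not the Elasticsearch builder; it only mirrors the four method names mentioned in the text, and the mapping follows the usual meaning of inclusive and exclusive bounds.

```java
/** Stand-in for illustration only; NOT the Elasticsearch GeoDistanceRangeQueryBuilder. */
final class RangeBounds {
    Number from;
    Number to;
    boolean includeLower;
    boolean includeUpper;

    RangeBounds from(Number value) { this.from = value; return this; }
    RangeBounds to(Number value) { this.to = value; return this; }
    RangeBounds includeLower(boolean include) { this.includeLower = include; return this; }
    RangeBounds includeUpper(boolean include) { this.includeUpper = include; return this; }

    // Equivalents of the removed convenience setters:
    static RangeBounds gt(RangeBounds b, Number v)  { return b.from(v).includeLower(false); } // exclusive lower bound
    static RangeBounds gte(RangeBounds b, Number v) { return b.from(v).includeLower(true); }  // inclusive lower bound
    static RangeBounds lt(RangeBounds b, Number v)  { return b.to(v).includeUpper(false); }   // exclusive upper bound
    static RangeBounds lte(RangeBounds b, Number v) { return b.to(v).includeUpper(true); }    // inclusive upper bound

    public static void main(String[] args) {
        RangeBounds bounds = lte(gt(new RangeBounds(), 10), 20);
        System.out.println(bounds.from + " (inclusive=" + bounds.includeLower + ") .. "
                + bounds.to + " (inclusive=" + bounds.includeUpper + ")");
    }
}
```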

+ 82 - 0
docs/reference/migration/migrate_5_0/mapping.asciidoc

@@ -0,0 +1,82 @@
+[[breaking_50_mapping_changes]]
+=== Mapping changes
+
+==== `string` fields replaced by `text`/`keyword` fields
+
+The `string` field datatype has been replaced by the `text` field for full
+text analyzed content, and the `keyword` field for not-analyzed exact string
+values.  For backwards compatibility purposes, during the 5.x series:
+
+* `string` fields on pre-5.0 indices will function as before.
+* New `string` fields can be added to pre-5.0 indices as before.
+* `text` and `keyword` fields can also be added to pre-5.0 indices.
+* When adding a `string` field to a new index, the field mapping will be
+  rewritten as a `text` or `keyword` field if possible, otherwise
+  an exception will be thrown.  Certain configurations that were possible
+  with `string` fields are no longer possible with `text`/`keyword` fields
+  such as enabling `term_vectors` on a not-analyzed `keyword` field.
+
+==== `index` property
+
+On all field datatypes (except for the deprecated `string` field), the `index`
+property now only accepts `true`/`false` instead of `not_analyzed`/`no`. The
+`string` field still accepts `analyzed`/`not_analyzed`/`no`.
+
+==== Doc values on unindexed fields
+
+Previously, setting a field to `index:no` would also disable doc-values.  Now,
+doc-values are always enabled on numeric and boolean fields unless
+`doc_values` is set to `false`.
+
+==== Floating points use `float` instead of `double`
+
+When dynamically mapping a field containing a floating point number, the field
+now defaults to using `float` instead of `double`. The reasoning is that
+floats should be more than enough for most cases but would decrease storage
+requirements significantly.
+
+==== `fielddata.format`
+
+Setting `fielddata.format: doc_values` in the mappings used to implicitly
+enable doc-values on a field. This no longer works: the only way to enable or
+disable doc-values is by using the `doc_values` property of mappings.
+
+==== Source-transform removed
+
+The source `transform` feature has been removed. Instead, use an ingest pipeline.
+
+==== `_parent` field no longer indexed
+
+The join between parent and child documents no longer relies on indexed fields
+and therefore from 5.0.0 onwards the `_parent` field is no longer indexed. In
+order to find documents that refer to a specific parent id the new
+`parent_id` query can be used. The GET response and hits inside the search
+response still include the parent id under the `_parent` key.
+
+==== Source `format` option
+
+The `_source` mapping no longer supports the `format` option. It will still be
+accepted for indices created before the upgrade to 5.0 for backwards
+compatibility, but it will have no effect. Indices created on or after 5.0
+will reject this option.
+
+==== Object notation
+
+Core types no longer support the object notation, which was used to provide
+per document boosts as follows:
+
+[source,json]
+---------------
+{
+  "value": "field_value",
+  "boost": 42
+}
+---------------
+
+==== Boost accuracy for queries on `_all`
+
+Per-field boosts on the `_all` field are now compressed into a single byte instead
+of the 4 bytes used previously. While this will make the index much more
+space-efficient, it also means that index time boosts will be less accurately
+encoded.
+
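The trade-off behind the dynamic mapping change from `double` to `float` can be seen with plain Java primitives. This is only a sketch of the general precision and size difference between the two types, not of how Elasticsearch stores values; `FloatVsDouble` and `survivesFloat` are hypothetical names.

```java
/** Illustration of float vs double precision; not Elasticsearch code. */
final class FloatVsDouble {

    /** True if the value is unchanged after being narrowed to float and widened back. */
    static boolean survivesFloat(double value) {
        return (double) (float) value == value;
    }

    public static void main(String[] args) {
        // A float occupies half the bytes of a double.
        System.out.println("float bytes:  " + Float.BYTES);
        System.out.println("double bytes: " + Double.BYTES);

        // 0.5 is exactly representable in both types; 0.1 is not, and the
        // nearest float differs from the nearest double.
        System.out.println("0.5 survives float: " + survivesFloat(0.5));
        System.out.println("0.1 survives float: " + survivesFloat(0.1));
    }
}
```

Fields that need full double precision can still be mapped explicitly as `double` rather than relying on dynamic mapping.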

+ 24 - 0
docs/reference/migration/migrate_5_0/packaging.asciidoc

@@ -0,0 +1,24 @@
+[[breaking_50_packaging]]
+=== Packaging
+
+==== Default logging using systemd (since Elasticsearch 2.2.0)
+
+In previous versions of Elasticsearch, the default logging
+configuration routed standard output to /dev/null and standard error to
+the journal. However, there are often critical error messages at
+startup that are logged to standard output rather than standard error
+and these error messages would be lost to the nether. The default has
+changed to now route standard output to the journal and standard error
+to inherit this setting (these are the defaults for systemd). These
+settings can be modified by editing the elasticsearch.service file.
+
+==== Longer startup times
+
+In Elasticsearch 5.0.0 the `-XX:+AlwaysPreTouch` flag has been added to the JVM
+startup options. This option touches all memory pages used by the JVM heap
+during initialization of the HotSpot VM to reduce the chance of having to commit
+a memory page during GC time. This will increase the startup time of
+Elasticsearch as well as increasing the initial resident memory usage of the
+Java process.
+
+

+ 41 - 0
docs/reference/migration/migrate_5_0/percolator.asciidoc

@@ -0,0 +1,41 @@
+[[breaking_50_percolator]]
+=== Percolator changes
+
+==== Percolator is near-real time
+
+Previously percolators were activated in real-time, i.e. as soon as they were
+indexed.  Now, changes to the percolator query are visible in near-real time,
+as soon as the index has been refreshed. This change was required because, in
+indices created from 5.0 onwards, the terms used in a percolator query are
+automatically indexed to allow for more efficient query selection during
+percolation.
+
+==== Percolator mapping
+
+The percolate API can no longer accept documents that reference fields that
+don't already exist in the mapping.
+
+The percolate API no longer modifies the mappings. Before the percolate API
+could be used to dynamically introduce new fields to the mappings based on the
+fields in the document being percolated. This no longer works, because these
+unmapped fields are not persisted in the mapping.
+
+==== Percolator documents returned by search
+
+Documents with the `.percolate` type were previously excluded from the search
+response, unless the `.percolate` type was specified explicitly in the search
+request.  Now, percolator documents are treated in the same way as any other
+document and are returned by search requests.
+
+==== Percolator `size` default
+
+The percolator by default sets the `size` option to `10` whereas before this
+was unlimited.
+
+==== Percolate API
+
+When percolating an existing document, specifying a document in the source
+of the percolate request is no longer allowed.
+
+
+

+ 99 - 0
docs/reference/migration/migrate_5_0/plugins.asciidoc

@@ -0,0 +1,99 @@
+[[breaking_50_plugins]]
+=== Plugin changes
+
+The command `bin/plugin` has been renamed to `bin/elasticsearch-plugin`. The
+structure of the plugin ZIP archive has changed. All the plugin files must be
+contained in a top-level directory called `elasticsearch`. If you use the
+gradle build, this structure is automatically generated.
+
+==== Site plugins removed
+
+Site plugins have been removed. Site plugins should be reimplemented as Kibana
+plugins.
+
+==== Multicast plugin removed
+
+Multicast has been removed. Use unicast discovery, or one of the cloud
+discovery plugins.
+
+==== Plugins with custom query implementations
+
+Plugins implementing custom queries need to implement the `fromXContent(QueryParseContext)` method in their
+`QueryParser` subclass rather than `parse`. This method will take care of parsing the query from `XContent` format
+into an intermediate query representation that can be streamed between the nodes in binary format, effectively the
+query object used in the java api. Also, the query parser needs to implement the `getBuilderPrototype` method that
+returns a prototype of the `NamedWriteable` query, which allows an incoming query to be deserialized by calling
+`readFrom(StreamInput)` against it, which will create a new object (see usages of `Writeable`). The `QueryParser`
+also needs to declare the generic type of the query that it supports and is able to parse.
+The query object can then transform itself into a lucene query through the new `toQuery(QueryShardContext)` method,
+which returns a lucene query to be executed on the data node.
+
+Similarly, plugins implementing custom score functions need to implement the `fromXContent(QueryParseContext)`
+method in their `ScoreFunctionParser` subclass rather than `parse`. This method parses
+the function from `XContent` format into an intermediate function representation that can be streamed between
+the nodes in binary format. This is effectively the function object used in the Java API. The function parser also needs
+to implement the `getBuilderPrototype` method, which returns a prototype of the `NamedWriteable` function. The prototype
+makes it possible to deserialize an incoming function by calling `readFrom(StreamInput)` against it, which creates a
+new object (see usages of `Writeable`). The `ScoreFunctionParser` also needs to declare the generic type of the
+function that it supports and is able to parse. The function object can then transform itself into a Lucene
+function through the new `toFunction(QueryShardContext)` method, which returns a Lucene function to be executed
+on the data node.
+
+==== Cloud AWS plugin changes
+
+Cloud AWS plugin has been split in two plugins:
+
+* {plugins}/discovery-ec2.html[Discovery EC2 plugin]
+* {plugins}/repository-s3.html[Repository S3 plugin]
+
+Proxy settings for both plugins have been renamed:
+
+* from `cloud.aws.proxy_host` to `cloud.aws.proxy.host`
+* from `cloud.aws.ec2.proxy_host` to `cloud.aws.ec2.proxy.host`
+* from `cloud.aws.s3.proxy_host` to `cloud.aws.s3.proxy.host`
+* from `cloud.aws.proxy_port` to `cloud.aws.proxy.port`
+* from `cloud.aws.ec2.proxy_port` to `cloud.aws.ec2.proxy.port`
+* from `cloud.aws.s3.proxy_port` to `cloud.aws.s3.proxy.port`
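+
+As a sketch of the new key layout, a global proxy configuration in
+`elasticsearch.yml` would now look roughly like this. The host name and port
+are placeholder values:
+
+[source,yaml]
+-----
+cloud:
+    aws:
+        proxy:
+            host: proxy.example.com
+            port: 8080
+-----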
+
+==== Cloud Azure plugin changes
+
+Cloud Azure plugin has been split in three plugins:
+
+* {plugins}/discovery-azure.html[Discovery Azure plugin]
+* {plugins}/repository-azure.html[Repository Azure plugin]
+* {plugins}/store-smb.html[Store SMB plugin]
+
+If you were using the `cloud-azure` plugin for snapshot and restore, you had in `elasticsearch.yml`:
+
+[source,yaml]
+-----
+cloud:
+    azure:
+        storage:
+            account: your_azure_storage_account
+            key: your_azure_storage_key
+-----
+
+You need to give a unique id to the storage details now as you can define multiple storage accounts:
+
+[source,yaml]
+-----
+cloud:
+    azure:
+        storage:
+            my_account:
+                account: your_azure_storage_account
+                key: your_azure_storage_key
+-----
+
+
+==== Cloud GCE plugin changes
+
+Cloud GCE plugin has been renamed to {plugins}/discovery-gce.html[Discovery GCE plugin].
+
+
+==== Mapper Attachments plugin deprecated
+
+Mapper attachments has been deprecated. Users should now use the {plugins}/ingest-attachment.html[`ingest-attachment`]
+plugin.
+

+ 17 - 0
docs/reference/migration/migrate_5_0/rest.asciidoc

@@ -0,0 +1,17 @@
+
+[[breaking_50_rest_api_changes]]
+=== REST API changes
+
+==== id values longer than 512 bytes are rejected
+
+When specifying an `_id` value longer than 512 bytes, the request will be
+rejected.
+
+==== `/_optimize` endpoint removed
+
+The deprecated `/_optimize` endpoint has been removed. The `/_forcemerge`
+endpoint should be used in lieu of optimize.
+
+The `GET` HTTP verb for `/_forcemerge` is no longer supported. Use the
+`POST` HTTP verb instead.
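+
+For example, a force merge of a hypothetical index named `my_index` is
+requested with:
+
+[source,sh]
+---------------
+POST /my_index/_forcemerge
+---------------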
+

+ 141 - 0
docs/reference/migration/migrate_5_0/search.asciidoc

@@ -0,0 +1,141 @@
+[[breaking_50_search_changes]]
+=== Search and Query DSL changes
+
+==== `search_type`
+
+===== `search_type=count` removed
+
+The `count` search type was deprecated since version 2.0.0 and is now removed.
+In order to get the same benefits, you just need to set the value of the `size`
+parameter to `0`.
+
+For instance, the following request:
+
+[source,sh]
+---------------
+GET /my_index/_search?search_type=count
+{
+  "aggs": {
+    "my_terms": {
+       "terms": {
+         "field": "foo"
+       }
+     }
+  }
+}
+---------------
+
+can be replaced with:
+
+[source,sh]
+---------------
+GET /my_index/_search
+{
+  "size": 0,
+  "aggs": {
+    "my_terms": {
+       "terms": {
+         "field": "foo"
+       }
+     }
+  }
+}
+---------------
+
+===== `search_type=scan` removed
+
+The `scan` search type was deprecated since version 2.1.0 and is now removed.
+All benefits from this search type can now be achieved by doing a scroll
+request that sorts documents in `_doc` order, for instance:
+
+[source,sh]
+---------------
+GET /my_index/_search?scroll=2m
+{
+  "sort": [
+    "_doc"
+  ]
+}
+---------------
+
+Scroll requests sorted by `_doc` have been optimized to more efficiently resume
+from where the previous request stopped, so this will have the same performance
+characteristics as the former `scan` search type.
+
+==== `fields` parameter
+
+The `fields` parameter used to try to retrieve field values from stored
+fields, and fall back to extracting from the `_source` if a field is not
+marked as stored.  Now, the `fields` parameter will only return stored fields
+-- it will no longer extract values from the `_source`.
+
+==== search-exists API removed
+
+The search exists api has been removed in favour of using the search api with
+`size` set to `0` and `terminate_after` set to `1`.
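+
+A minimal sketch of an equivalent existence check, assuming a hypothetical
+index `my_index` and a placeholder `match_all` query. A `hits.total` greater
+than zero in the response indicates that at least one document matches:
+
+[source,sh]
+---------------
+GET /my_index/_search
+{
+  "size": 0,
+  "terminate_after": 1,
+  "query": {
+    "match_all": {}
+  }
+}
+---------------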
+
+
+==== Deprecated queries removed
+
+The following deprecated queries have been removed:
+
+`filtered`::      Use `bool` query instead, which supports `filter` clauses too.
+`and`::           Use `must` clauses in a `bool` query instead.
+`or`::            Use `should` clauses in a `bool` query instead.
+`limit`::         Use the `terminate_after` parameter instead.
+`fquery`::        Is obsolete after filters and queries have been merged.
+`query`::         Is obsolete after filters and queries have been merged.
+`query_binary`::  Was undocumented and has been removed.
+`filter_binary`:: Was undocumented and has been removed.
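+
+As an illustration of the first replacement, a former `filtered` query with a
+query part and a filter part can be expressed as a `bool` query along these
+lines. The index name, field names and values are made up for the example:
+
+[source,sh]
+---------------
+GET /my_index/_search
+{
+  "query": {
+    "bool": {
+      "must": {
+        "match": { "title": "search" }
+      },
+      "filter": {
+        "term": { "status": "published" }
+      }
+    }
+  }
+}
+---------------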
+
+
+==== Changes to queries
+
+* Removed support for the deprecated `min_similarity` parameter in `fuzzy
+  query`, in favour of `fuzziness`.
+
+* Removed support for the deprecated `fuzzy_min_sim` parameter in
+  `query_string` query, in favour of `fuzziness`.
+
+* Removed support for the deprecated `edit_distance` parameter in completion
+  suggester, in favour of `fuzziness`.
+
+* Removed support for the deprecated `filter` and `no_match_filter` fields in `indices` query,
+in favour of `query` and `no_match_query`.
+
+* Removed support for the deprecated `filter` fields in `nested` query, in favour of `query`.
+
+* Removed support for the deprecated `minimum_should_match` and
+  `disable_coord` in `terms` query, use `bool` query instead. Also removed
+  support for the deprecated `execution` parameter.
+
+* Removed support for the top level `filter` element in `function_score` query, replaced by `query`.
+
+* The `collect_payloads` parameter of the `span_near` query has been deprecated.  Payloads will be loaded when needed.
+
+* The `score_type` parameter to the `has_child` and `has_parent` queries has been removed in favour of `score_mode`.
+  Also, the `sum` score mode has been removed in favour of the `total` mode.
+
+*  When the `max_children` parameter was set to `0` on the `has_child` query
+   then there was no upper limit on how many child documents were allowed to
+   match. Now, `0` really means that zero child documents are allowed. If no
+   upper limit is needed then the `max_children` parameter shouldn't be specified
+   at all.
+
+
+==== Top level `filter` parameter
+
+Removed support for the deprecated top level `filter` in the search api,
+replaced by `post_filter`.
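+
+For instance, a request that previously used a top level `filter` would now
+be written roughly as follows. The index name, field names and values are
+placeholders:
+
+[source,sh]
+---------------
+GET /my_index/_search
+{
+  "query": {
+    "match": { "title": "search" }
+  },
+  "post_filter": {
+    "term": { "status": "published" }
+  }
+}
+---------------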
+
+==== Highlighters
+
+Support for multiple highlighter names has been removed. The only supported names are
+`plain`, `fvh` and `postings`.
+
+==== Term vectors API
+
+The term vectors APIs no longer persist unmapped fields in the mappings.
+
+The `dfs` parameter to the term vectors API has been removed completely. Term
+vectors don't support distributed document frequencies anymore.

+ 174 - 0
docs/reference/migration/migrate_5_0/settings.asciidoc

@@ -0,0 +1,174 @@
+[[breaking_50_settings_changes]]
+=== Settings changes
+
+From Elasticsearch 5.0 on, all settings are validated before they are applied.
+Node level and default index level settings are validated on node startup;
+dynamic cluster and index settings are validated before they are updated/added
+to the cluster state.
+
+Every setting must be a *known* setting. All settings must have been
+registered with the node or transport client they are used with. This implies
+that plugins that define custom settings must register all of their settings
+during plugin loading using the `SettingsModule#registerSettings(Setting)`
+method.
+
+==== Node settings
+
+The `name` setting has been removed and is replaced by `node.name`. Usage of
+`-Dname=some_node_name` is not supported anymore.
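+
+In `elasticsearch.yml`, the node name from the example above would be set
+like this:
+
+[source,yaml]
+-----
+node.name: some_node_name
+-----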
+
+==== Transport Settings
+
+All settings with a `netty` infix have been replaced by their already existing
+`transport` synonyms. For instance `transport.netty.bind_host` is no longer
+supported and should be replaced by the superseding setting
+`transport.bind_host`.
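+
+A short sketch of the replacement in `elasticsearch.yml`; the address is a
+placeholder value:
+
+[source,yaml]
+-----
+# Previously: transport.netty.bind_host: 192.168.0.1
+transport.bind_host: 192.168.0.1
+-----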
+
+==== Script mode settings
+
+Previously, script mode settings (e.g., `script.inline: true`,
+`script.engine.groovy.inline.aggs: false`, etc.) accepted the values
+`on`, `true`, `1`, and `yes` for enabling a scripting mode, and the
+values `off`, `false`, `0`, and `no` for disabling a scripting mode.
+The variants `on`, `1`, and `yes` for enabling and `off`, `0`,
+and `no` for disabling are no longer supported.
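+
+In other words, only `true` and `false` remain valid, as in this
+`elasticsearch.yml` fragment built from the two settings mentioned above:
+
+[source,yaml]
+-----
+script.inline: true
+script.engine.groovy.inline.aggs: false
+-----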
+
+
+==== Security manager settings
+
+The option to disable the security manager, `security.manager.enabled`, has been
+removed. In order to grant special permissions to Elasticsearch, users must
+edit the local Java Security Policy.
+
+==== Network settings
+
+The `_non_loopback_` value for settings like `network.host` would arbitrarily
+pick the first interface not marked as loopback. Instead, specify by address
+scope (e.g. `_local_,_site_` for all loopback and private network addresses)
+or by explicit interface names, hostnames, or addresses.
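+
+As a rough example, the address-scope form mentioned above could be written
+in `elasticsearch.yml` as a list. This assumes that `network.host` accepts an
+array of special values, which is how the comma-separated form is interpreted:
+
+[source,yaml]
+-----
+network.host: ["_local_", "_site_"]
+-----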
+
+==== Forbid changing of thread pool types
+
+Previously, <<modules-threadpool,thread pool types>> could be dynamically
+adjusted. The thread pool type effectively controls the backing queue for the
+thread pool and modifying this is an expert setting with minimal practical
+benefits and high risk of being misused. The ability to change the thread pool
+type for any thread pool has been removed. It is still possible to adjust
+relevant thread pool parameters for each of the thread pools (e.g., depending
+on the thread pool type, `keep_alive`, `queue_size`, etc.).
+
+
+==== Analysis settings
+
+The `index.analysis.analyzer.default_index` analyzer is not supported anymore.
+If you wish to change the analyzer to use for indexing, change the
+`index.analysis.analyzer.default` analyzer instead.
+
+==== Ping timeout settings
+
+Previously, there were three settings for the ping timeout:
+`discovery.zen.initial_ping_timeout`, `discovery.zen.ping.timeout` and
+`discovery.zen.ping_timeout`. The first two have been removed and the only
+setting key for the ping timeout is now `discovery.zen.ping_timeout`. The
+default value for ping timeouts remains at three seconds.
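+
+A node that previously overrode one of the removed keys would now use the
+remaining key in `elasticsearch.yml`. The value shown is simply the default:
+
+[source,yaml]
+-----
+discovery.zen.ping_timeout: 3s
+-----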
+
+==== Recovery settings
+
+Recovery settings deprecated in 1.x have been removed:
+
+ * `index.shard.recovery.translog_size` is superseded by `indices.recovery.translog_size`
+ * `index.shard.recovery.translog_ops` is superseded by `indices.recovery.translog_ops`
+ * `index.shard.recovery.file_chunk_size` is superseded by `indices.recovery.file_chunk_size`
+ * `index.shard.recovery.concurrent_streams` is superseded by `indices.recovery.concurrent_streams`
+ * `index.shard.recovery.concurrent_small_file_streams` is superseded by `indices.recovery.concurrent_small_file_streams`
+ * `indices.recovery.max_size_per_sec` is superseded by `indices.recovery.max_bytes_per_sec`
+
+If you are using any of these settings, please take the time to review their
+purpose. All of the settings above are considered _expert settings_ and should
+only be used if absolutely necessary. If you have set any of the above settings
+as persistent cluster settings, please use the settings update API and set
+their superseding keys accordingly.
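+
+As a sketch of such an update, the following request sets the replacement for
+`indices.recovery.max_size_per_sec` as a persistent cluster setting. The value
+`40mb` is only an example:
+
+[source,sh]
+---------------
+PUT /_cluster/settings
+{
+  "persistent": {
+    "indices.recovery.max_bytes_per_sec": "40mb"
+  }
+}
+---------------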
+
+The following settings have been removed without replacement:
+
+ * `indices.recovery.concurrent_small_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries is throttled via allocation deciders
+ * `indices.recovery.concurrent_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries is throttled via allocation deciders
+
+==== Translog settings
+
+The `index.translog.flush_threshold_ops` setting is not supported anymore. In
+order to control flushes based on the transaction log growth use
+`index.translog.flush_threshold_size` instead.
+
+Changing the translog type with `index.translog.fs.type` is not supported
+anymore, the `buffered` implementation is now the only available option and
+uses a fixed `8kb` buffer.
+
+The translog by default is fsynced after every `index`, `create`, `update`,
+`delete`, or `bulk` request.  The ability to fsync on every operation is not
+necessary anymore. In fact, it can be a performance bottleneck and it is
+error-prone since it is enabled by a special value set on `index.translog.sync_interval`.
+Now, `index.translog.sync_interval` doesn't accept a value less than `100ms`,
+which prevents fsyncing too often if async durability is enabled. The special
+value `0` is no longer supported.
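+
+A minimal sketch of an index created with async durability and a sync
+interval above the new minimum. The index name and the `5s` interval are
+illustrative, and `index.translog.durability` is assumed here as the setting
+that enables async durability:
+
+[source,sh]
+---------------
+PUT /my_index
+{
+  "settings": {
+    "index.translog.durability": "async",
+    "index.translog.sync_interval": "5s"
+  }
+}
+---------------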
+
+==== Request Cache Settings
+
+The deprecated settings `index.cache.query.enable` and
+`indices.cache.query.size` have been removed and are replaced with
+`index.requests.cache.enable` and `indices.requests.cache.size` respectively.
+
+`indices.requests.cache.clean_interval` has been replaced with
+`indices.cache.clean_interval` and is no longer supported.
+
+==== Field Data Cache Settings
+
+The `indices.fielddata.cache.clean_interval` setting has been replaced with
+`indices.cache.clean_interval`.
+
+==== Allocation settings
+
+The `cluster.routing.allocation.concurrent_recoveries` setting has been
+replaced with `cluster.routing.allocation.node_concurrent_recoveries`.
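+
+In `elasticsearch.yml`, the renamed setting would be used as follows. The
+value `2` is an example only:
+
+[source,yaml]
+-----
+cluster.routing.allocation.node_concurrent_recoveries: 2
+-----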
+
+==== Similarity settings
+
+The `default` similarity has been renamed to `classic`.
+
+==== Indexing settings
+
+The `indices.memory.min_shard_index_buffer_size` and
+`indices.memory.max_shard_index_buffer_size` settings have been removed, as
+Elasticsearch now allows any one shard to use any amount of heap as long as the
+total indexing buffer heap used across all shards is below the node's
+`indices.memory.index_buffer_size` (defaults to 10% of the JVM heap).
+
+==== Removed es.max-open-files
+
+The ability to set the system property `es.max-open-files` to `true` to get
+Elasticsearch to print the number of maximum open files for the
+Elasticsearch process has been removed. This same information can be
+obtained from the <<cluster-nodes-info>> API, and a warning is logged
+on startup if it is set too low.
+
+==== Removed es.netty.gathering
+
+Netty could previously be prevented from using NIO gathering via the escape
+hatch of setting the system property `es.netty.gathering` to `false`.
+Time has proven that enabling gathering by default is a non-issue, and this
+undocumented setting has been removed.
+
+==== Removed es.useLinkedTransferQueue
+
+The system property `es.useLinkedTransferQueue` could be used to
+control the queue implementation used in the cluster service and the
+handling of ping responses during discovery. This was an undocumented
+setting and has been removed.
+
+==== Cache concurrency level settings removed
+
+Two cache concurrency level settings,
+`indices.requests.cache.concurrency_level` and
+`indices.fielddata.cache.concurrency_level`, have been removed because they no
+longer apply to the cache implementation used for the request cache and the
+field data cache.
+