123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393 |
- === Mapping changes
- A number of changes have been made to mappings to remove ambiguity and to
- ensure that conflicting mappings cannot be created.
- One major change is that dynamically added fields must have their mapping
- confirmed by the master node before indexing continues. This is to avoid a
- problem where different shards in the same index dynamically add different
- mappings for the same field. These conflicting mappings can silently return
- incorrect results and can lead to index corruption.
- This change can make indexing slower when frequently adding many new fields.
- We are looking at ways of optimising this process but we chose safety over
- performance for this extreme use case.
- ==== Conflicting field mappings
- Fields with the same name, in the same index, in different types, must have
- the same mapping, with the exception of the <<copy-to>>, <<dynamic>>,
- <<enabled>>, <<ignore-above>>, <<include-in-all>>, and <<properties>>
- parameters, which may have different settings per field.
- [source,js]
- ---------------
- PUT my_index
- {
- "mappings": {
- "type_one": {
- "properties": {
- "name": { <1>
- "type": "string"
- }
- }
- },
- "type_two": {
- "properties": {
- "name": { <1>
- "type": "string",
- "analyzer": "english"
- }
- }
- }
- }
- }
- ---------------
- <1> The two `name` fields have conflicting mappings and will prevent Elasticsearch
- from starting.
- Elasticsearch will not start in the presence of conflicting field mappings.
- These indices must be deleted or reindexed using a new mapping.
- The `ignore_conflicts` option of the put mappings API has been removed.
- Conflicts can't be ignored anymore.
- ==== Fields cannot be referenced by short name
- A field can no longer be referenced using its short name. Instead, the full
- path to the field is required. For instance:
- [source,js]
- ---------------
- PUT my_index
- {
- "mappings": {
- "my_type": {
- "properties": {
- "title": { "type": "string" }, <1>
- "name": {
- "properties": {
- "title": { "type": "string" }, <2>
- "first": { "type": "string" },
- "last": { "type": "string" }
- }
- }
- }
- }
- }
- }
- ---------------
- <1> This field is referred to as `title`.
- <2> This field is referred to as `name.title`.
- Previously, the two `title` fields in the example above could have been
- confused with each other when using the short name `title`.
- ==== Type name prefix removed
- Previously, two fields with the same name in two different types could
- sometimes be disambiguated by prepending the type name. As a side effect, it
- would add a filter on the type name to the relevant query. This feature was
- ambiguous -- a type name could be confused with a field name -- and didn't
- work everywhere e.g. aggregations.
- Instead, fields should be specified with the full path, but without a type
- name prefix. If you wish to filter by the `_type` field, either specify the
- type in the URL or add an explicit filter.
- The following example query in 1.x:
- [source,js]
- ----------------------------
- GET my_index/_search
- {
- "query": {
- "match": {
- "my_type.some_field": "quick brown fox"
- }
- }
- }
- ----------------------------
- would be rewritten in 2.0 as:
- [source,js]
- ----------------------------
- GET my_index/my_type/_search <1>
- {
- "query": {
- "match": {
- "some_field": "quick brown fox" <2>
- }
- }
- }
- ----------------------------
- <1> The type name can be specified in the URL to act as a filter.
- <2> The field name should be specified without the type prefix.
- ==== Field names may not contain dots
- In 1.x, it was possible to create fields with dots in their name, for
- instance:
- [source,js]
- ----------------------------
- PUT my_index
- {
- "mappings": {
- "my_type": {
- "properties": {
- "foo.bar": { <1>
- "type": "string"
- },
- "foo": {
- "properties": {
- "bar": { <1>
- "type": "string"
- }
- }
- }
- }
- }
- }
- }
- ----------------------------
- <1> These two fields cannot be distinguised as both are referred to as `foo.bar`.
- You can no longer create fields with dots in the name.
- ==== Type names may not start with a dot
- In 1.x, Elasticsearch would issue a warning if a type name included a dot,
- e.g. `my.type`. Now that type names are no longer used to distinguish between
- fields in differnt types, this warning has been relaxed: type names may now
- contain dots, but they may not *begin* with a dot. The only exception to this
- is the special `.percolator` type.
- ==== Types may no longer be deleted
- In 1.x it was possible to delete a type mapping, along with all of the
- documents of that type, using the delete mapping API. This is no longer
- supported, because remnants of the fields in the type could remain in the
- index, causing corruption later on.
- Instead, if you need to delete a type mapping, you should reindex to a new
- index which does not contain the mapping. If you just need to delete the
- documents that belong to that type, then use the delete-by-query plugin
- instead.
- [[migration-meta-fields]]
- ==== Type meta-fields
- The <<mapping-fields,meta-fields>> associated with had configuration options
- removed, to make them more reliable:
- * `_id` configuration can no longer be changed. If you need to sort, use the <<mapping-uid-field,`_uid`>> field instead.
- * `_type` configuration can no longer be changed.
- * `_index` configuration can no longer be changed.
- * `_routing` configuration is limited to marking routing as required.
- * `_field_names` configuration is limited to disabling the field.
- * `_size` configuration is limited to enabling the field.
- * `_timestamp` configuration is limited to enabling the field, setting format and default value.
- * `_boost` has been removed.
- * `_analyzer` has been removed.
- Importantly, *meta-fields can no longer be specified as part of the document
- body.* Instead, they must be specified in the query string parameters. For
- instance, in 1.x, the `routing` could be specified as follows:
- [source,json]
- -----------------------------
- PUT my_index
- {
- "mappings": {
- "my_type": {
- "_routing": {
- "path": "group" <1>
- },
- "properties": {
- "group": { <1>
- "type": "string"
- }
- }
- }
- }
- }
- PUT my_index/my_type/1 <2>
- {
- "group": "foo"
- }
- -----------------------------
- <1> This 1.x mapping tells Elasticsearch to extract the `routing` value from the `group` field in the document body.
- <2> This indexing request uses a `routing` value of `foo`.
- In 2.0, the routing must be specified explicitly:
- [source,json]
- -----------------------------
- PUT my_index
- {
- "mappings": {
- "my_type": {
- "_routing": {
- "required": true <1>
- },
- "properties": {
- "group": {
- "type": "string"
- }
- }
- }
- }
- }
- PUT my_index/my_type/1?routing=bar <2>
- {
- "group": "foo"
- }
- -----------------------------
- <1> Routing can be marked as required to ensure it is not forgotten during indexing.
- <2> This indexing request uses a `routing` value of `bar`.
- ==== Analyzer mappings
- Previously, `index_analyzer` and `search_analyzer` could be set separately,
- while the `analyzer` setting would set both. The `index_analyzer` setting has
- been removed in favour of just using the `analyzer` setting.
- If just the `analyzer` is set, it will be used at index time and at search time. To use a different analyzer at search time, specify both the `analyzer` and a `search_analyzer`.
- The `index_analyzer`, `search_analyzer`, and `analyzer` type-level settings
- have also been removed, as is is no longer possible to select fields based on
- the type name.
- The `_analyzer` meta-field, which allowed setting an analyzer per document has
- also been removed. It will be ignored on older indices.
- ==== Date fields and Unix timestamps
- Previously, `date` fields would first try to parse values as a Unix timestamp
- -- milliseconds-since-the-epoch -- before trying to use their defined date
- `format`. This meant that formats like `yyyyMMdd` could never work, as values
- would be interpreted as timestamps.
- In 2.0, we have added two formats: `epoch_millis` and `epoch_second`. Only
- date fields that use these formats will be able to parse timestamps.
- These formats cannot be used in dynamic templates, because they are
- indistinguishable from long values.
- ==== Default date format
- The default date format has changed from `date_optional_time` to
- `strict_date_optional_time`, which expects a 4 digit year, and a 2 digit month
- and day, (and optionally, 2 digit hour, minute, and second).
- A dynamically added date field, by default, includes the `epoch_millis`
- format to support timestamp parsing. For instance:
- [source,js]
- -------------------------
- PUT my_index/my_type/1
- {
- "date_one": "2015-01-01" <1>
- }
- -------------------------
- <1> Has `format`: `"strict_date_optional_time||epoch_millis"`.
- [[migration-bool-fields]]
- ==== Boolean fields
- Boolean fields used to have a string fielddata with `F` meaning `false` and `T`
- meaning `true`. They have been refactored to use numeric fielddata, with `0`
- for `false` and `1` for `true`. As a consequence, the format of the responses of
- the following APIs changed when applied to boolean fields: `0`/`1` is returned
- instead of `F`/`T`:
- * <<search-request-fielddata-fields,fielddata fields>>
- * <<search-request-sort,sort values>>
- * <<search-aggregations-bucket-terms-aggregation,terms aggregations>>
- In addition, terms aggregations use a custom formatter for boolean (like for
- dates and ip addresses, which are also backed by numbers) in order to return
- the user-friendly representation of boolean fields: `false`/`true`:
- [source,js]
- ---------------
- "buckets": [
- {
- "key": 0,
- "key_as_string": "false",
- "doc_count": 42
- },
- {
- "key": 1,
- "key_as_string": "true",
- "doc_count": 12
- }
- ]
- ---------------
- ==== `index_name` and `path` removed
- The `index_name` setting was used to change the name of the Lucene field,
- and the `path` setting was used on `object` fields to determine whether the
- Lucene field should use the full path (including parent object fields), or
- just the final `name`.
- These setting have been removed as their purpose is better served with the
- <<copy-to>> parameter.
- ==== Murmur3 Fields
- Fields of type `murmur3` can no longer change `doc_values` or `index` setting.
- They are always mapped as follows:
- [source,js]
- ---------------------
- {
- "type": "murmur3",
- "index": "no",
- "doc_values": true
- }
- ---------------------
- ==== Mappings in config files not supported
- The ability to specify mappings in configuration files has been removed. To
- specify default mappings that apply to multiple indexes, use
- <<indices-templates,index templates>> instead.
- Along with this change, the following settings have ben removed:
- * `index.mapper.default_mapping_location`
- * `index.mapper.default_percolator_mapping_location`
- ==== Posting and doc-values codecs
- It is no longer possible to specify per-field postings and doc values formats
- in the mappings. This setting will be ignored on indices created before 2.0
- and will cause mapping parsing to fail on indices created on or after 2.0. For
- old indices, this means that new segments will be written with the default
- postings and doc values formats of the current codec.
- It is still possible to change the whole codec by using the `index.codec`
- setting. Please however note that using a non-default codec is discouraged as
- it could prevent future versions of Elasticsearch from being able to read the
- index.
- ==== Compress and compress threshold
- The `compress` and `compress_threshold` options have been removed from the
- `_source` field and fields of type `binary`. These fields are compressed by
- default. If you would like to increase compression levels, use the new
- <<index-codec,`index.codec: best_compression`>> setting instead.
- ==== position_offset_gap
- The default `position_offset_gap` is now 100. Indexes created in Elasticsearch
- 2.0.0 will default to using 100 and indexes created before that will continue
- to use the old default of 0. This was done to prevent phrase queries from
- matching across different values of the same term unexpectedly. Specifically,
- 100 was chosen to cause phrase queries with slops up to 99 to match only within
- a single value of a field.
|