Add a note about multi-valued doc values in the docs. (#134107)

Martijn van Groningen 1 month ago
parent
commit
275d049b6f
1 changed files with 7 additions and 0 deletions

+ 7 - 0
docs/reference/elasticsearch/mapping-reference/doc-values.md

@@ -73,6 +73,13 @@ PUT my-index-000001
 1. The `status_code` field has `doc_values` enabled by default.
 2. The `session_id` has `doc_values` disabled, but can still be queried.
 
+## Multi-valued doc values note
+
+Elasticsearch supports storing multi-valued fields at index time. Multi-valued fields can be provided as a JSON array. However, in the doc values format, the values aren't stored in the order in which they were provided at index time. Additionally, duplicates may be lost.
+This implementation detail of doc values is visible when features interact directly with doc values, which may be the case, for example, in ES|QL or in aggregations in the search API. Note that `_source` always returns arrays exactly as they were provided at index time.
+
+How the ordering differs depends on whether the field is mapped as `keyword` or as a numeric field type. For the `keyword` field type, the values of each document are ordered lexicographically and duplicates are lost. If retaining duplicates is important, use the `counted_keyword` field type instead.
+For numeric field types (e.g. `long`, `double`, `scaled_float`), the values of each document are ordered in natural order and duplicates are retained.
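
To make the behavior described in the added note concrete, here is a small Python sketch that models it. It is not Elasticsearch code and makes no API calls: the function names `keyword_doc_values` and `numeric_doc_values` are hypothetical, and it assumes that "lexicographically" can be approximated by Python's default string ordering and that "natural order" means ascending numeric order.

```python
# Illustrative model only: mimics how the note says multi-valued doc values
# appear per document. Function names are hypothetical, not an Elasticsearch API.

def keyword_doc_values(values):
    """Model of `keyword`: values sorted lexicographically, duplicates dropped.

    Assumption: Python's default string ordering stands in for the
    lexicographic ordering mentioned in the note.
    """
    return sorted(set(values))


def numeric_doc_values(values):
    """Model of numeric types: values sorted, duplicates retained.

    Assumption: "natural order" is taken to mean ascending numeric order.
    """
    return sorted(values)


# Arrays as they might be provided at index time (and as `_source` keeps them).
tags_source = ["beta", "alpha", "beta"]
codes_source = [404, 200, 404, 301]

tags_doc_values = keyword_doc_values(tags_source)
codes_doc_values = numeric_doc_values(codes_source)

print(tags_source, "->", tags_doc_values)
print(codes_source, "->", codes_doc_values)
```

In this model the `keyword` array loses both its original order and its repeated entry, while the numeric array keeps the repeated value but is reordered. That is the difference a feature reading doc values directly would observe, according to the note.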