- [[search-fields]]
- == Retrieve selected fields from a search
- ++++
- <titleabbrev>Retrieve selected fields</titleabbrev>
- ++++
- By default, each hit in the search response includes the document
- <<mapping-source-field,`_source`>>, which is the entire JSON object that was
- provided when indexing the document. There are two recommended methods to
- retrieve selected fields from a search query:
- * Use the <<search-fields-param,`fields` option>> to extract the values of
- fields present in the index mapping
- * Use the <<source-filtering,`_source` option>> if you need to access the original data that was passed at index time
- You can use both of these methods, though the `fields` option is preferred
- because it consults both the document data and index mappings. In some
- instances, you might want to use <<field-retrieval-methods,other methods>> of
- retrieving data.
- [discrete]
- [[search-fields-param]]
- === The `fields` option
- To retrieve specific fields in the search response, use the `fields` parameter.
- // tag::fields-param-desc[]
- Because it consults the index mappings, the `fields` parameter provides several
- advantages over referencing the `_source` directly. Specifically, the `fields`
- parameter:
- * Returns each value in a standardized way that matches its mapping type
- * Accepts <<multi-fields,multi-fields>> and <<field-alias,field aliases>>
- * Formats dates and spatial data types
- * Retrieves <<runtime-retrieving-fields,runtime field values>>
- * Returns fields calculated by a script at index time
- * Returns fields from related indices using <<lookup-runtime-fields, lookup runtime fields>>
- // end::fields-param-desc[]
- Other mapping options are also respected, including
- <<ignore-above,`ignore_above`>>, <<ignore-malformed,`ignore_malformed`>>, and
- <<null-value,`null_value`>>.
- The `fields` option returns values in the way that matches how {es} indexes
- them. For standard fields, this means that the `fields` option looks in
- `_source` to find the values, then parses and formats them using the mappings.
- [discrete]
- [[search-fields-request]]
- ==== Retrieve specific fields
- The following search request uses the `fields` parameter to retrieve values
- for the `user.id` field, all fields starting with `http.response.`, and the
- `@timestamp` field.
- Using object notation, you can pass a <<search-api-fields,`format`>> argument to
- customize the format of returned date or geospatial values.
- [source,console]
- ----
- POST my-index-000001/_search
- {
- "query": {
- "match": {
- "user.id": "kimchy"
- }
- },
- "fields": [
- "user.id",
- "http.response.*", <1>
- {
- "field": "@timestamp",
- "format": "epoch_millis" <2>
- }
- ],
- "_source": false
- }
- ----
- // TEST[setup:my_index]
- // TEST[s/_search/_search\?filter_path=hits/]
- // tag::fields-param-callouts[]
- <1> Both full field names and wildcard patterns are accepted.
- <2> Use the `format` parameter to apply a custom format for the field's values.
- // end::fields-param-callouts[]
- NOTE: By default, document metadata fields like `_id` or `_index` are not
- returned when the requested `fields` option uses wildcard patterns like `*`.
- However, when explicitly requested using the field name, the `_id`, `_routing`,
- `_ignored`, `_index` and `_version` metadata fields can be retrieved.
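Because the request body above is plain JSON, it can also be assembled programmatically. The following is a minimal Python sketch that uses only the standard library. The helper names `field_spec` and `build_search_body` are hypothetical and are not part of any {es} client. The sketch only builds the body and does not send it anywhere.

```python
import json


def field_spec(name, fmt=None):
    """Return one `fields` entry: a bare string, or object notation when a format is given."""
    if fmt is None:
        return name
    return {"field": name, "format": fmt}


def build_search_body(user_id, specs):
    """Assemble a search body that returns only the requested fields and omits `_source`."""
    return {
        "query": {"match": {"user.id": user_id}},
        "fields": list(specs),
        "_source": False,
    }


# Hypothetical usage that mirrors the request shown above.
body = build_search_body(
    "kimchy",
    [
        field_spec("user.id"),
        field_spec("http.response.*"),
        field_spec("@timestamp", fmt="epoch_millis"),
    ],
)
print(json.dumps(body, indent=2))
```

Keeping the string and object forms behind one helper means callers only switch to object notation when they actually need a `format`.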
- [discrete]
- [[search-fields-response]]
- ==== Response always returns an array
- The `fields` response always returns an array of values for each field,
- even when there is a single value in the `_source`. This is because {es} has
- no dedicated array type, and any field could contain multiple values. The
- `fields` parameter also does not guarantee that array values are returned in
- a specific order. See the mapping documentation on <<array,arrays>> for more
- background.
- The response includes values as a flat list in the `fields` section for each
- hit. Because the `fields` parameter doesn't fetch entire objects, only leaf
- fields are returned.
- [source,console-result]
- ----
- {
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 1.0,
- "hits" : [
- {
- "_index" : "my-index-000001",
- "_id" : "0",
- "_score" : 1.0,
- "fields" : {
- "user.id" : [
- "kimchy"
- ],
- "@timestamp" : [
- "4098435132000"
- ],
- "http.response.bytes": [
- 1070000
- ],
- "http.response.status_code": [
- 200
- ]
- }
- }
- ]
- }
- }
- ----
- // TESTRESPONSE[s/"max_score" : 1.0/"max_score" : $body.hits.max_score/]
- // TESTRESPONSE[s/"_score" : 1.0/"_score" : $body.hits.hits.0._score/]
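Since every field arrives as an array, client code often needs a small accessor for fields it knows to be single-valued. The following Python sketch shows one way to do that on an already decoded hit. The helper name `first_value` is an assumption made for illustration, and the sample `hit` is copied from the response above.

```python
def first_value(hit, name, default=None):
    """Return the first value of a field in a hit, or `default` if the field is absent."""
    values = hit.get("fields", {}).get(name)
    return values[0] if values else default


# A decoded hit, shaped like the response above.
hit = {
    "_index": "my-index-000001",
    "_id": "0",
    "_score": 1.0,
    "fields": {
        "user.id": ["kimchy"],
        "@timestamp": ["4098435132000"],
        "http.response.bytes": [1070000],
        "http.response.status_code": [200],
    },
}

user_id = first_value(hit, "user.id")
status = first_value(hit, "http.response.status_code")
missing = first_value(hit, "http.request.method")
```

Because the order of array values is not guaranteed, taking the first element is only meaningful for fields that hold a single value.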
- [discrete]
- [[search-fields-nested]]
- ==== Retrieve nested fields
- [%collapsible]
- ====
- The `fields` response for <<nested,`nested` fields>> is slightly different from that
- of regular object fields. While leaf values inside regular `object` fields are
- returned as a flat list, values inside `nested` fields are grouped to maintain the
- independence of each object inside the original nested array.
- For each entry inside a nested field array, values are again returned as a flat list
- unless there are other `nested` fields inside the parent nested object, in which case
- the same procedure is repeated again for the deeper nested fields.
- The following example creates a mapping where `user` is a nested field,
- indexes a document, and then retrieves all fields:
- [source,console]
- --------------------------------------------------
- PUT my-index-000001
- {
- "mappings": {
- "properties": {
- "group" : { "type" : "keyword" },
- "user": {
- "type": "nested",
- "properties": {
- "first" : { "type" : "keyword" },
- "last" : { "type" : "keyword" }
- }
- }
- }
- }
- }
- PUT my-index-000001/_doc/1?refresh=true
- {
- "group" : "fans",
- "user" : [
- {
- "first" : "John",
- "last" : "Smith"
- },
- {
- "first" : "Alice",
- "last" : "White"
- }
- ]
- }
- POST my-index-000001/_search
- {
- "fields": ["*"],
- "_source": false
- }
- --------------------------------------------------
- The response groups the `first` and `last` values by nested object instead of
- returning them as a flat list.
- [source,console-result]
- ----
- {
- "took": 2,
- "timed_out": false,
- "_shards": {
- "total": 1,
- "successful": 1,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": {
- "value": 1,
- "relation": "eq"
- },
- "max_score": 1.0,
- "hits": [{
- "_index": "my-index-000001",
- "_id": "1",
- "_score": 1.0,
- "fields": {
- "group" : ["fans"],
- "user": [{
- "first": ["John"],
- "last": ["Smith"]
- },
- {
- "first": ["Alice"],
- "last": ["White"]
- }
- ]
- }
- }]
- }
- }
- ----
- // TESTRESPONSE[s/"took": 2/"took": $body.took/]
- // TESTRESPONSE[s/"max_score" : 1.0/"max_score" : $body.hits.max_score/]
- // TESTRESPONSE[s/"_score" : 1.0/"_score" : $body.hits.hits.0._score/]
- Nested fields will be grouped by their nested paths, no matter the pattern used
- to retrieve them. For example, if you query only for the `user.first` field from
- the previous example:
- [source,console]
- --------------------------------------------------
- POST my-index-000001/_search
- {
- "fields": ["user.first"],
- "_source": false
- }
- --------------------------------------------------
- // TEST[continued]
- The response returns only the user's first name, but still maintains the
- structure of the nested `user` array:
- [source,console-result]
- ----
- {
- "took": 2,
- "timed_out": false,
- "_shards": {
- "total": 1,
- "successful": 1,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": {
- "value": 1,
- "relation": "eq"
- },
- "max_score": 1.0,
- "hits": [{
- "_index": "my-index-000001",
- "_id": "1",
- "_score": 1.0,
- "fields": {
- "user": [{
- "first": ["John"]
- },
- {
- "first": ["Alice"]
- }
- ]
- }
- }]
- }
- }
- ----
- // TESTRESPONSE[s/"took": 2/"took": $body.took/]
- // TESTRESPONSE[s/"max_score" : 1.0/"max_score" : $body.hits.max_score/]
- // TESTRESPONSE[s/"_score" : 1.0/"_score" : $body.hits.hits.0._score/]
- However, when the `fields` pattern targets the nested `user` field directly, no
- values will be returned because the pattern doesn't match any leaf fields.
- ====
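A client that wants plain objects back from the grouped nested response can walk the structure recursively. The Python sketch below is one possible approach, not {es} behavior. The function name `unwrap_nested` is hypothetical, and the sketch assumes that leaf values are scalars (fields whose values are themselves returned as JSON objects would need separate handling).

```python
def unwrap_nested(entries):
    """Convert grouped nested-field entries into plain dicts.

    Leaf values arrive as lists of scalars; deeper nested fields arrive as
    lists of dicts and are processed recursively.
    """
    result = []
    for entry in entries:
        obj = {}
        for key, values in entry.items():
            if values and all(isinstance(v, dict) for v in values):
                # Assumption: a list of dicts is a deeper nested field.
                obj[key] = unwrap_nested(values)
            else:
                # Unwrap single values; keep multi-valued leaves as lists.
                obj[key] = values[0] if len(values) == 1 else values
        result.append(obj)
    return result


# The `fields` section from the first nested response above.
fields = {
    "group": ["fans"],
    "user": [
        {"first": ["John"], "last": ["Smith"]},
        {"first": ["Alice"], "last": ["White"]},
    ],
}

users = unwrap_nested(fields["user"])
```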
- [discrete]
- [[retrieve-unmapped-fields]]
- ==== Retrieve unmapped fields
- [%collapsible]
- ====
- By default, the `fields` parameter returns only values of mapped fields.
- However, {es} allows storing fields in `_source` that are unmapped, for example by
- setting <<dynamic-field-mapping,dynamic field mapping>> to `false` or by using
- an object field with `enabled: false`. These options disable parsing and
- indexing of the object content.
- To retrieve unmapped fields in an object from `_source`, use the
- `include_unmapped` option in the `fields` section:
- [source,console]
- ----
- PUT my-index-000001
- {
- "mappings": {
- "enabled": false <1>
- }
- }
- PUT my-index-000001/_doc/1?refresh=true
- {
- "user_id": "kimchy",
- "session_data": {
- "object": {
- "some_field": "some_value"
- }
- }
- }
- POST my-index-000001/_search
- {
- "fields": [
- "user_id",
- {
- "field": "session_data.object.*",
- "include_unmapped" : true <2>
- }
- ],
- "_source": false
- }
- ----
- <1> Disable all mappings.
- <2> Include unmapped fields matching this field pattern.
- The response will contain field results under the `session_data.object.*` path,
- even if the fields are unmapped. The `user_id` field is also unmapped, but it
- won't be included in the response because `include_unmapped` isn't set to
- `true` for that field pattern.
- [source,console-result]
- ----
- {
- "took" : 2,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 1.0,
- "hits" : [
- {
- "_index" : "my-index-000001",
- "_id" : "1",
- "_score" : 1.0,
- "fields" : {
- "session_data.object.some_field": [
- "some_value"
- ]
- }
- }
- ]
- }
- }
- ----
- // TESTRESPONSE[s/"took" : 2/"took": $body.took/]
- // TESTRESPONSE[s/"max_score" : 1.0/"max_score" : $body.hits.max_score/]
- // TESTRESPONSE[s/"_score" : 1.0/"_score" : $body.hits.hits.0._score/]
- ====
- [discrete]
- [[ignored-field-values]]
- ==== Ignored field values
- [%collapsible]
- ====
- The `fields` section of the response only returns values that were valid when indexed.
- If your search request asks for values from a field that ignored certain values
- because they were malformed or too large, these values are returned
- separately in an `ignored_field_values` section.
- The following example indexes a document with a value that is ignored and
- not added to the index, so it is shown separately in the search results:
- [source,console]
- ----
- PUT my-index-000001
- {
- "mappings": {
- "properties": {
- "my-small" : { "type" : "keyword", "ignore_above": 2 }, <1>
- "my-large" : { "type" : "keyword" }
- }
- }
- }
- PUT my-index-000001/_doc/1?refresh=true
- {
- "my-small": ["ok", "bad"], <2>
- "my-large": "ok content"
- }
- POST my-index-000001/_search
- {
- "fields": ["my-*"],
- "_source": false
- }
- ----
- <1> This field has a size restriction.
- <2> This document field has a value that exceeds the size restriction, so it is ignored and not indexed.
- The response contains the ignored field values under the `ignored_field_values` path.
- These values are retrieved from the document's original JSON source and are returned raw:
- they are not formatted or processed in any way, unlike the successfully indexed values, which are
- returned in the `fields` section.
- [source,console-result]
- ----
- {
- "took" : 2,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 1.0,
- "hits" : [
- {
- "_index" : "my-index-000001",
- "_id" : "1",
- "_score" : 1.0,
- "_ignored" : [ "my-small"],
- "fields" : {
- "my-large": [
- "ok content"
- ],
- "my-small": [
- "ok"
- ]
- },
- "ignored_field_values" : {
- "my-small": [
- "bad"
- ]
- }
- }
- ]
- }
- }
- ----
- // TESTRESPONSE[s/"took" : 2/"took": $body.took/]
- // TESTRESPONSE[s/"max_score" : 1.0/"max_score" : $body.hits.max_score/]
- // TESTRESPONSE[s/"_score" : 1.0/"_score" : $body.hits.hits.0._score/]
- ====
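When processing such a response, it can help to view the indexed and ignored values of each field side by side. The following Python sketch does this for a single decoded hit. The helper name `values_by_status` and the output shape are assumptions chosen for illustration.

```python
def values_by_status(hit):
    """Return {field: {"indexed": [...], "ignored": [...]}} for one decoded hit."""
    indexed = hit.get("fields", {})
    ignored = hit.get("ignored_field_values", {})
    merged = {}
    for name in sorted(set(indexed) | set(ignored)):
        merged[name] = {
            "indexed": indexed.get(name, []),
            "ignored": ignored.get(name, []),  # raw values taken from `_source`
        }
    return merged


# A decoded hit, shaped like the response above.
hit = {
    "_index": "my-index-000001",
    "_id": "1",
    "_score": 1.0,
    "_ignored": ["my-small"],
    "fields": {"my-large": ["ok content"], "my-small": ["ok"]},
    "ignored_field_values": {"my-small": ["bad"]},
}

merged = values_by_status(hit)
```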
- [discrete]
- [[source-filtering]]
- === The `_source` option
- You can use the `_source` parameter to select which fields of the source are
- returned. This is called _source filtering_.
- The following search API request sets the `_source` request body parameter to
- `false`. The document source is not included in the response.
- [source,console]
- ----
- GET /_search
- {
- "_source": false,
- "query": {
- "match": {
- "user.id": "kimchy"
- }
- }
- }
- ----
- To return only a subset of source fields, specify a wildcard (`*`) pattern in
- the `_source` parameter. The following search API request returns the source for
- only the `obj` field and its properties.
- [source,console]
- ----
- GET /_search
- {
- "_source": "obj.*",
- "query": {
- "match": {
- "user.id": "kimchy"
- }
- }
- }
- ----
- You can also specify an array of wildcard patterns in the `_source` parameter. The
- following search API request returns the source for only the `obj1` and
- `obj2` fields and their properties.
- [source,console]
- ----
- GET /_search
- {
- "_source": [ "obj1.*", "obj2.*" ],
- "query": {
- "match": {
- "user.id": "kimchy"
- }
- }
- }
- ----
- For finer control, you can specify an object containing arrays of `includes` and
- `excludes` patterns in the `_source` parameter.
- If the `includes` property is specified, only source fields that match one of
- its patterns are returned. You can exclude fields from this subset using the
- `excludes` property.
- If the `includes` property is not specified, the entire document source is
- returned, excluding any fields that match a pattern in the `excludes` property.
- The following search API request returns the source for only the `obj1` and
- `obj2` fields and their properties, excluding any child `description` fields.
- [source,console]
- ----
- GET /_search
- {
- "_source": {
- "includes": [ "obj1.*", "obj2.*" ],
- "excludes": [ "*.description" ]
- },
- "query": {
- "term": {
- "user.id": "kimchy"
- }
- }
- }
- ----
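To build intuition for how `includes` and `excludes` combine, the following Python sketch applies the same rule to an in-memory document: keep a leaf if it matches any include pattern (or if no includes are given), then drop it if it matches any exclude pattern. This is a rough client-side approximation, not {es}'s implementation. It uses the standard library's `fnmatch` for wildcard matching and returns flat dotted paths rather than a nested object, and the function names and the sample document are hypothetical.

```python
from fnmatch import fnmatchcase


def leaf_paths(obj, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from leaf_paths(value, path + ".")
        else:
            yield path, value


def filter_source(source, includes=(), excludes=()):
    """Keep leaves matching any include (all leaves if none given) and no exclude."""
    kept = {}
    for path, value in leaf_paths(source):
        if includes and not any(fnmatchcase(path, p) for p in includes):
            continue
        if any(fnmatchcase(path, p) for p in excludes):
            continue
        kept[path] = value
    return kept


# Hypothetical document used only for illustration.
source = {
    "obj1": {"name": "a", "description": "x"},
    "obj2": {"size": 3, "description": "y"},
    "other": {"k": 1},
}

both = filter_source(source, ["obj1.*", "obj2.*"], ["*.description"])
excludes_only = filter_source(source, excludes=["*.description"])
```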
- [discrete]
- [[field-retrieval-methods]]
- === Other methods of retrieving data
- .Using `fields` is typically better
- ****
- These options are usually not required. Using the `fields` option is typically
- the better choice, unless you absolutely need to force loading a stored field or
- a doc value field.
- ****
- A document's `_source` is stored as a single field in Lucene. This structure
- means that the whole `_source` object must be loaded and parsed even if you're
- only requesting part of it. To avoid this limitation, you can try other options
- for loading fields:
- * Use the <<docvalue-fields,`docvalue_fields`>>
- parameter to get values for selected fields. This can be a good
- choice when returning a fairly small number of fields that support doc values,
- such as keywords and dates.
- * Use the <<request-body-search-stored-fields, `stored_fields`>> parameter to
- get the values for specific stored fields (fields that use the
- <<mapping-store,`store`>> mapping option).
- {es} always attempts to load values from `_source`. This behavior has the same
- implications as source filtering, where {es} needs to load and parse the entire
- `_source` to retrieve just one field.
- [discrete]
- [[docvalue-fields]]
- ==== Doc value fields
- You can use the <<docvalue-fields,`docvalue_fields`>> parameter to return
- <<doc-values,doc values>> for one or more fields in the search response.
- Doc values store the same values as the `_source` but in an on-disk,
- column-based structure that's optimized for sorting and aggregations. Since each
- field is stored separately, {es} only reads the field values that were requested
- and can avoid loading the whole document `_source`.
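The difference between the two layouts can be illustrated with a toy model. The Python sketch below is only a conceptual illustration and does not reflect Lucene's actual on-disk format. It stores the same hypothetical documents once as serialized blobs (like `_source`) and once as per-field columns (like doc values).

```python
import json

# Hypothetical documents used only for illustration.
docs = [
    {"user.id": "kimchy", "http.response.bytes": 1070000, "message": "GET /search"},
    {"user.id": "elkbee", "http.response.bytes": 2048, "message": "GET /index"},
]

# Row-oriented: one serialized blob per document, similar in spirit to `_source`.
source_store = [json.dumps(doc) for doc in docs]

# Column-oriented: one mapping per field, keyed by document position.
columns = {}
for doc_id, doc in enumerate(docs):
    for name, value in doc.items():
        columns.setdefault(name, {})[doc_id] = value


def from_source(doc_id, field):
    """Parse the whole document blob, then pick out one field."""
    return json.loads(source_store[doc_id])[field]


def from_columns(doc_id, field):
    """Read one value directly from that field's column."""
    return columns[field][doc_id]
```

Both functions return the same value, but `from_source` has to decode every field of the document first, while `from_columns` touches only the requested field.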
- Doc values are stored for supported fields by default. However, doc values are
- not supported for <<text,`text`>> or
- {plugins}/mapper-annotated-text-usage.html[`text_annotated`] fields.
- The following search request uses the `docvalue_fields` parameter to retrieve
- doc values for the `user.id` field, all fields starting with `http.response.`, and the
- `@timestamp` field:
- [source,console]
- ----
- GET my-index-000001/_search
- {
- "query": {
- "match": {
- "user.id": "kimchy"
- }
- },
- "docvalue_fields": [
- "user.id",
- "http.response.*", <1>
- {
- "field": "@timestamp",
- "format": "epoch_millis" <2>
- }
- ]
- }
- ----
- // TEST[setup:my_index]
- <1> Both full field names and wildcard patterns are accepted.
- <2> Using object notation, you can pass a `format` parameter to apply a custom
- format for the field's doc values. <<date,Date fields>> support a
- <<mapping-date-format,date `format`>>. <<number,Numeric fields>> support a
- https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html[DecimalFormat
- pattern]. Other field datatypes do not support the `format` parameter.
- TIP: You cannot use the `docvalue_fields` parameter to retrieve doc values for
- nested objects. If you specify a nested object, the search returns an empty
- array (`[ ]`) for the field. To access nested fields, use the
- <<inner-hits, `inner_hits`>> parameter's `docvalue_fields`
- property.
- [discrete]
- [[stored-fields]]
- ==== Stored fields
- It's also possible to store an individual field's values by using the
- <<mapping-store,`store`>> mapping option. You can use the
- `stored_fields` parameter to include these stored values in the search response.
- WARNING: The `stored_fields` parameter is for fields that are explicitly marked as
- stored in the mapping, which is off by default and generally not recommended.
- Use <<source-filtering,source filtering>> instead to select
- subsets of the original source document to be returned.
- The `stored_fields` parameter lets you selectively load specific stored fields for each
- document represented by a search hit.
- [source,console]
- --------------------------------------------------
- GET /_search
- {
- "stored_fields" : ["user", "postDate"],
- "query" : {
- "term" : { "user" : "kimchy" }
- }
- }
- --------------------------------------------------
- `*` can be used to load all stored fields from the document.
- An empty array will cause only the `_id` and `_type` for each hit to be
- returned, for example:
- [source,console]
- --------------------------------------------------
- GET /_search
- {
- "stored_fields" : [],
- "query" : {
- "term" : { "user" : "kimchy" }
- }
- }
- --------------------------------------------------
- If the requested fields are not stored (`store` mapping set to `false`), they will be ignored.
- Stored field values fetched from the document itself are always returned as an array. In contrast, metadata fields like `_routing` are never returned as an array.
- Also, only leaf fields can be returned via the `stored_fields` option. If an object field is specified, it will be ignored.
- NOTE: On its own, `stored_fields` cannot be used to load fields in nested
- objects -- if a field contains a nested object in its path, then no data will
- be returned for that stored field. To access nested fields, `stored_fields`
- must be used within an <<inner-hits, `inner_hits`>> block.
- [discrete]
- [[disable-stored-fields]]
- ===== Disable stored fields
- To disable stored fields (and metadata fields) entirely, use `_none_`:
- [source,console]
- --------------------------------------------------
- GET /_search
- {
- "stored_fields": "_none_",
- "query" : {
- "term" : { "user" : "kimchy" }
- }
- }
- --------------------------------------------------
- NOTE: <<source-filtering,`_source`>> and <<request-body-search-version, `version`>> parameters cannot be activated if `_none_` is used.
- [discrete]
- [[script-fields]]
- ==== Script fields
- You can use the `script_fields` parameter to retrieve a <<modules-scripting,script
- evaluation>> (based on different fields) for each hit. For example:
- [source,console]
- --------------------------------------------------
- GET /_search
- {
- "query": {
- "match_all": {}
- },
- "script_fields": {
- "test1": {
- "script": {
- "lang": "painless",
- "source": "doc['price'].value * 2"
- }
- },
- "test2": {
- "script": {
- "lang": "painless",
- "source": "doc['price'].value * params.factor",
- "params": {
- "factor": 2.0
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[setup:sales]
- Script fields can work on fields that are not stored (`price` in
- the above case), and allow custom values to be returned (the
- evaluated value of the script).
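The second script in the request above takes its multiplier from `params` instead of hard-coding it. The following Python sketch shows how a client might reuse one script source with different parameter values. The helper name `scaled_price_field` and the output field names are hypothetical, and the sketch only builds the request body.

```python
import json


def scaled_price_field(factor):
    """Build one `script_fields` entry that multiplies `price` by `factor`."""
    return {
        "script": {
            "lang": "painless",
            # The source text stays constant; only the parameter changes.
            "source": "doc['price'].value * params.factor",
            "params": {"factor": factor},
        }
    }


body = {
    "query": {"match_all": {}},
    "script_fields": {
        "doubled": scaled_price_field(2.0),
        "tripled": scaled_price_field(3.0),
    },
}
print(json.dumps(body, indent=2))
```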
- Script fields can also access the actual `_source` document and
- extract specific elements to be returned from it by using `params['_source']`.
- Here is an example:
- [source,console]
- --------------------------------------------------
- GET /_search
- {
- "query": {
- "match_all": {}
- },
- "script_fields": {
- "test1": {
- "script": "params['_source']['message']"
- }
- }
- }
- --------------------------------------------------
- // TEST[setup:my_index]
- Note the `_source` keyword here to navigate the JSON-like model.
- It's important to understand the difference between
- `doc['my_field'].value` and `params['_source']['my_field']`. The first,
- using the `doc` keyword, causes the terms for that field to be loaded into
- memory (cached), which results in faster execution but more memory
- consumption. Also, the `doc[...]` notation only allows for simple-valued
- fields (you can't return a JSON object from it) and makes sense only for
- non-analyzed or single-term-based fields. However, using `doc` is
- still the recommended way to access values from the document, if at all
- possible, because `_source` must be loaded and parsed every time it's used,
- which makes it very slow.