123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129 |
- [[mapping-source-field]]
- === `_source` field
- The `_source` field contains the original JSON document body that was passed
- at index time. The `_source` field itself is not indexed (and thus is not
- searchable), but it is stored so that it can be returned when executing
- _fetch_ requests, like <<docs-get,get>> or <<search-search,search>>.
- [[disable-source-field]]
- ==== Disabling the `_source` field
- Though very handy to have around, the source field does incur storage overhead
- within the index. For this reason, it can be disabled as follows:
- [source,console]
- --------------------------------------------------
- PUT tweets
- {
- "mappings": {
- "_source": {
- "enabled": false
- }
- }
- }
- --------------------------------------------------
- [WARNING]
- .Think before disabling the `_source` field
- ==================================================
- Users often disable the `_source` field without thinking about the
- consequences, and then live to regret it. If the `_source` field isn't
- available then a number of features are not supported:
- * The <<docs-update,`update`>>, <<docs-update-by-query,`update_by_query`>>,
- and <<docs-reindex,`reindex`>> APIs.
- * On the fly <<request-body-search-highlighting,highlighting>>.
- * The ability to reindex from one Elasticsearch index to another, either
- to change mappings or analysis, or to upgrade an index to a new major
- version.
- * The ability to debug queries or aggregations by viewing the original
- document used at index time.
- * Potentially in the future, the ability to repair index corruption
- automatically.
- ==================================================
- TIP: If disk space is a concern, rather increase the
- <<index-codec,compression level>> instead of disabling the `_source`.
- .The metrics use case
- **************************************************
- The _metrics_ use case is distinct from other time-based or logging use cases
- in that there are many small documents which consist only of numbers, dates,
- or keywords. There are no updates, no highlighting requests, and the data
- ages quickly so there is no need to reindex. Search requests typically use
- simple queries to filter the dataset by date or tags, and the results are
- returned as aggregations.
- In this case, disabling the `_source` field will save space and reduce I/O.
- **************************************************
- [[include-exclude]]
- ==== Including / Excluding fields from `_source`
- An expert-only feature is the ability to prune the contents of the `_source`
- field after the document has been indexed, but before the `_source` field is
- stored.
- WARNING: Removing fields from the `_source` has similar downsides to disabling
- `_source`, especially the fact that you cannot reindex documents from one
- Elasticsearch index to another. Consider using
- <<source-filtering,source filtering>> instead.
- The `includes`/`excludes` parameters (which also accept wildcards) can be used
- as follows:
- [source,console]
- --------------------------------------------------
- PUT logs
- {
- "mappings": {
- "_source": {
- "includes": [
- "*.count",
- "meta.*"
- ],
- "excludes": [
- "meta.description",
- "meta.other.*"
- ]
- }
- }
- }
- PUT logs/_doc/1
- {
- "requests": {
- "count": 10,
- "foo": "bar" <1>
- },
- "meta": {
- "name": "Some metric",
- "description": "Some metric description", <1>
- "other": {
- "foo": "one", <1>
- "baz": "two" <1>
- }
- }
- }
- GET logs/_search
- {
- "query": {
- "match": {
- "meta.other.foo": "one" <2>
- }
- }
- }
- --------------------------------------------------
- <1> These fields will be removed from the stored `_source` field.
- <2> We can still search on this field, even though it is not in the stored `_source`.
|