| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128 | [[synthetic-source]]==== Synthetic `_source`Though very handy to have around, the source field takes up a significant amountof space on disk. Instead of storing source documents on disk exactly as yousend them, Elasticsearch can reconstruct source content on the fly upon retrieval.Enable this by setting `mode: synthetic` in `_source`:[source,console,id=enable-synthetic-source-example]----PUT idx{  "mappings": {    "_source": {      "mode": "synthetic"    }  }}----// TESTSETUPWhile this on the fly reconstruction is *generally* slower than saving the sourcedocuments verbatim and loading them at query time, it saves a lot of storagespace. There are a couple of restrictions to be aware of:* When you retrieve synthetic `_source` content it undergoes minor<<synthetic-source-modifications,modifications>> compared to the original JSON.* Synthetic `_source` can be used with indices that contain only these fieldtypes:** <<aggregate-metric-double-synthetic-source, `aggregate_metric_double`>>** <<boolean-synthetic-source,`boolean`>>** <<numeric-synthetic-source,`byte`>>** <<date-synthetic-source,`date`>>** <<date-nanos-synthetic-source,`date_nanos`>>** <<dense-vector-synthetic-source,`dense_vector`>>** <<numeric-synthetic-source,`double`>>** <<numeric-synthetic-source,`float`>>** <<geo-point-synthetic-source,`geo_point`>>** <<numeric-synthetic-source,`half_float`>>** <<histogram-synthetic-source,`histogram`>>** <<numeric-synthetic-source,`integer`>>** <<ip-synthetic-source,`ip`>>** <<keyword-synthetic-source,`keyword`>>** <<numeric-synthetic-source,`long`>>** <<numeric-synthetic-source,`scaled_float`>>** <<numeric-synthetic-source,`short`>>** <<text-synthetic-source,`text`>> (with a `keyword` sub-field)** <<version-synthetic-source,`version`>>Runtime fields cannot, at this stage, use synthetic `_source`.[[synthetic-source-modifications]]===== Synthetic source modificationsWhen synthetic `_source` is enabled, retrieved documents undergo somemodifications compared to the original JSON.[[synthetic-source-modifications-leaf-arrays]]====== Arrays moved to leaf fieldsSynthetic `_source` arrays are moved to leaves. For example:[source,console,id=synthetic-source-leaf-arrays-example]----PUT idx/_doc/1{  "foo": [    {      "bar": 1    },    {      "bar": 2    }  ]}----// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]Will become:[source,console-result]----{  "foo": {    "bar": [1, 2]  }}----// TEST[s/^/{"_source":/ s/\n$/}/][[synthetic-source-modifications-field-names]]====== Fields named as they are mappedSynthetic source names fields as they are named in the mapping. When usedwith <<dynamic,dynamic mapping>>, fields with dots (`.`) in their names are, bydefault, interpreted as multiple objects, while dots in field names arepreserved within objects that have <<subobjects>> disabled. For example:[source,console,id=synthetic-source-objecty-example]----PUT idx/_doc/1{  "foo.bar.baz": 1}----// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]Will become:[source,console-result]----{  "foo": {    "bar": {      "baz": 1    }  }}----// TEST[s/^/{"_source":/ s/\n$/}/][[synthetic-source-modifications-alphabetical]]====== Alphabetical sortingSynthetic `_source` fields are sorted alphabetically. Thehttps://www.rfc-editor.org/rfc/rfc7159.html[JSON RFC] defines objects as"an unordered collection of zero or more name/value pairs" so applicationsshouldn't care but without synthetic `_source` the original ordering ispreserved and some applications may, counter to the spec, do something withthat ordering.
 |