[[synthetic-source]]
==== Synthetic `_source`

IMPORTANT: Synthetic `_source` is Generally Available only for TSDB indices
(indices that have `index.mode` set to `time_series`). For other indices
synthetic `_source` is in technical preview. Features in technical preview may
be changed or removed in a future release. Elastic will work to fix
any issues, but features in technical preview are not subject to the support SLA
of official GA features.

Though very handy to have around, the `_source` field takes up a significant
amount of space on disk. Instead of storing source documents on disk exactly as
you send them, Elasticsearch can reconstruct source content on the fly upon
retrieval. Enable this by setting `mode: synthetic` in `_source`:

[source,console,id=enable-synthetic-source-example]
----
PUT idx
{
  "mappings": {
    "_source": {
      "mode": "synthetic"
    }
  }
}
----
// TESTSETUP

While this on-the-fly reconstruction is *generally* slower than saving the source
documents verbatim and loading them at query time, it saves a lot of storage
space.

[[synthetic-source-restrictions]]
===== Synthetic `_source` restrictions

There are a couple of restrictions to be aware of:

* When you retrieve synthetic `_source` content it undergoes minor
<<synthetic-source-modifications,modifications>> compared to the original JSON.
* Synthetic `_source` can be used with indices that contain only these field
types:

** <<aggregate-metric-double-synthetic-source, `aggregate_metric_double`>>
** {plugins}/mapper-annotated-text-usage.html#annotated-text-synthetic-source[`annotated-text`]
** <<binary-synthetic-source,`binary`>>
** <<boolean-synthetic-source,`boolean`>>
** <<numeric-synthetic-source,`byte`>>
** <<date-synthetic-source,`date`>>
** <<date-nanos-synthetic-source,`date_nanos`>>
** <<dense-vector-synthetic-source,`dense_vector`>>
** <<numeric-synthetic-source,`double`>>
** <<flattened-synthetic-source, `flattened`>>
** <<numeric-synthetic-source,`float`>>
** <<geo-point-synthetic-source,`geo_point`>>
** <<geo-shape-synthetic-source,`geo_shape`>>
** <<numeric-synthetic-source,`half_float`>>
** <<histogram-synthetic-source,`histogram`>>
** <<numeric-synthetic-source,`integer`>>
** <<ip-synthetic-source,`ip`>>
** <<keyword-synthetic-source,`keyword`>>
** <<numeric-synthetic-source,`long`>>
** <<range-synthetic-source,`range` types>>
** <<numeric-synthetic-source,`scaled_float`>>
** <<search-as-you-type-synthetic-source,`search_as_you_type`>>
** <<numeric-synthetic-source,`short`>>
** <<text-synthetic-source,`text`>>
** <<version-synthetic-source,`version`>>
** <<wildcard-synthetic-source,`wildcard`>>

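
As a rough sketch of a mapping that stays within this list, the following
request creates an index whose fields all use supported types. The index name
`idx_supported` and every field name are hypothetical, chosen only for
illustration:

[source,console]
----
PUT idx_supported
{
  "mappings": {
    "_source": {
      "mode": "synthetic"
    },
    "properties": {
      "host": { "type": "keyword" }, <1>
      "client_ip": { "type": "ip" },
      "bytes": { "type": "long" },
      "@timestamp": { "type": "date" }
    }
  }
}
----
<1> The field names are illustrative assumptions. Any field type from the list
above could be used in their place.
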
[[synthetic-source-modifications]]
===== Synthetic `_source` modifications

When synthetic `_source` is enabled, retrieved documents undergo some
modifications compared to the original JSON.

[[synthetic-source-modifications-leaf-arrays]]
====== Arrays moved to leaf fields

Synthetic `_source` moves arrays to leaf fields: an array of objects comes back
as a single object whose leaf fields hold the collected values. For example:

[source,console,id=synthetic-source-leaf-arrays-example]
----
PUT idx/_doc/1
{
  "foo": [
    {
      "bar": 1
    },
    {
      "bar": 2
    }
  ]
}
----
// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]

Will become:

[source,console-result]
----
{
  "foo": {
    "bar": [1, 2]
  }
}
----
// TEST[s/^/{"_source":/ s/\n$/}/]

This can cause some arrays to vanish. When no leaf field appears in more than
one element of the array, each leaf holds a single value and no array remains:

[source,console,id=synthetic-source-leaf-arrays-example-sneaky]
----
PUT idx/_doc/1
{
  "foo": [
    {
      "bar": 1
    },
    {
      "baz": 2
    }
  ]
}
----
// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]

Will become:

[source,console-result]
----
{
  "foo": {
    "bar": 1,
    "baz": 2
  }
}
----
// TEST[s/^/{"_source":/ s/\n$/}/]

[[synthetic-source-modifications-field-names]]
====== Fields named as they are mapped

Synthetic `_source` names fields as they are named in the mapping. When used
with <<dynamic,dynamic mapping>>, fields with dots (`.`) in their names are, by
default, interpreted as multiple objects, while dots in field names are
preserved within objects that have <<subobjects>> disabled. For example:

[source,console,id=synthetic-source-objecty-example]
----
PUT idx/_doc/1
{
  "foo.bar.baz": 1
}
----
// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]

Will become:

[source,console-result]
----
{
  "foo": {
    "bar": {
      "baz": 1
    }
  }
}
----
// TEST[s/^/{"_source":/ s/\n$/}/]

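
The `subobjects` case can be sketched in a similar way. This example assumes a
hypothetical index `idx_subobjects` with an illustrative `metrics` object that
has `subobjects` disabled. None of these names come from the original text:

[source,console]
----
PUT idx_subobjects
{
  "mappings": {
    "_source": {
      "mode": "synthetic"
    },
    "properties": {
      "metrics": {
        "type": "object",
        "subobjects": false <1>
      }
    }
  }
}

PUT idx_subobjects/_doc/1
{
  "metrics": {
    "time.max": 2,
    "time.min": 1
  }
}
----
<1> `metrics` is a hypothetical field name. Disabling `subobjects` means that
fields under it are treated as leaf fields even when their names contain dots.

Following the rule above, the dotted names `time.max` and `time.min` should be
kept as they are under `metrics` rather than being expanded into a nested
`time` object.
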
[[synthetic-source-modifications-alphabetical]]
====== Alphabetical sorting

Synthetic `_source` fields are sorted alphabetically. The
https://www.rfc-editor.org/rfc/rfc7159.html[JSON RFC] defines objects as
"an unordered collection of zero or more name/value pairs", so applications
shouldn't care about field order. Without synthetic `_source`, however, the
original ordering is preserved, and some applications may, counter to the spec,
do something with that ordering.

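
A minimal illustration, assuming the `idx` index created earlier and
hypothetical field names:

[source,console]
----
PUT idx/_doc/1
{
  "zeta": 1,
  "mid": 2,
  "alpha": 3
}
----
// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]

Will become:

[source,console-result]
----
{
  "alpha": 3,
  "mid": 2,
  "zeta": 1
}
----
// TEST[s/^/{"_source":/ s/\n$/}/]
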
[[synthetic-source-modifications-ranges]]
====== Representation of ranges

Range field values (for example, `long_range`) are always represented as
inclusive on both sides, with bounds adjusted accordingly. See
<<range-synthetic-source-inclusive, examples>>.
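
A short sketch may make the adjustment concrete. It assumes a hypothetical
index `idx_ranges` with an illustrative `my_range` field. For an integer range
type such as `long_range`, exclusive bounds are shifted to the nearest
inclusive value:

[source,console]
----
PUT idx_ranges
{
  "mappings": {
    "_source": {
      "mode": "synthetic"
    },
    "properties": {
      "my_range": { "type": "long_range" } <1>
    }
  }
}

PUT idx_ranges/_doc/1
{
  "my_range": {
    "gt": 200,
    "lt": 300
  }
}
----
// TEST[s/$/\nGET idx_ranges\/_doc\/1?filter_path=_source\n/]
<1> `my_range` is a hypothetical field name used only for this example.

Will become:

[source,console-result]
----
{
  "my_range": {
    "gte": 201,
    "lte": 299
  }
}
----
// TEST[s/^/{"_source":/ s/\n$/}/]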