multivalued-fields.asciidoc 5.8 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314
  1. [[esql-multivalued-fields]]
  2. === {esql} multivalued fields
  3. ++++
  4. <titleabbrev>Multivalued fields</titleabbrev>
  5. ++++
  6. {esql} is fine reading from multivalued fields:
  7. [source,console,id=esql-multivalued-fields-reorders]
  8. ----
  9. POST /mv/_bulk?refresh
  10. { "index" : {} }
  11. { "a": 1, "b": [2, 1] }
  12. { "index" : {} }
  13. { "a": 2, "b": 3 }
  14. POST /_query
  15. {
  16. "query": "FROM mv | LIMIT 2"
  17. }
  18. ----
  19. Multivalued fields come back as a JSON array:
  20. [source,console-result]
  21. ----
  22. {
  23. "took": 28,
  24. "is_partial": false,
  25. "documents_found": 2,
  26. "values_loaded": 5,
  27. "columns": [
  28. { "name": "a", "type": "long"},
  29. { "name": "b", "type": "long"}
  30. ],
  31. "values": [
  32. [1, [1, 2]],
  33. [2, 3]
  34. ]
  35. }
  36. ----
  37. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]
  38. The relative order of values in a multivalued field is undefined. They'll frequently be in
  39. ascending order but don't rely on that.
  40. [discrete]
  41. [[esql-multivalued-fields-dups]]
  42. ==== Duplicate values
  43. Some field types, like <<keyword-field-type,`keyword`>> remove duplicate values on write:
  44. [source,console,id=esql-multivalued-fields-kwdups]
  45. ----
  46. PUT /mv
  47. {
  48. "mappings": {
  49. "properties": {
  50. "b": {"type": "keyword"}
  51. }
  52. }
  53. }
  54. POST /mv/_bulk?refresh
  55. { "index" : {} }
  56. { "a": 1, "b": ["foo", "foo", "bar"] }
  57. { "index" : {} }
  58. { "a": 2, "b": ["bar", "bar"] }
  59. POST /_query
  60. {
  61. "query": "FROM mv | LIMIT 2"
  62. }
  63. ----
  64. And {esql} sees that removal:
  65. [source,console-result]
  66. ----
  67. {
  68. "took": 28,
  69. "is_partial": false,
  70. "documents_found": 2,
  71. "values_loaded": 5,
  72. "columns": [
  73. { "name": "a", "type": "long"},
  74. { "name": "b", "type": "keyword"}
  75. ],
  76. "values": [
  77. [1, ["bar", "foo"]],
  78. [2, "bar"]
  79. ]
  80. }
  81. ----
  82. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]
  83. But other types, like `long` don't remove duplicates.
  84. [source,console,id=esql-multivalued-fields-longdups]
  85. ----
  86. PUT /mv
  87. {
  88. "mappings": {
  89. "properties": {
  90. "b": {"type": "long"}
  91. }
  92. }
  93. }
  94. POST /mv/_bulk?refresh
  95. { "index" : {} }
  96. { "a": 1, "b": [2, 2, 1] }
  97. { "index" : {} }
  98. { "a": 2, "b": [1, 1] }
  99. POST /_query
  100. {
  101. "query": "FROM mv | LIMIT 2"
  102. }
  103. ----
  104. And {esql} also sees that:
  105. [source,console-result]
  106. ----
  107. {
  108. "took": 28,
  109. "is_partial": false,
  110. "documents_found": 2,
  111. "values_loaded": 7,
  112. "columns": [
  113. { "name": "a", "type": "long"},
  114. { "name": "b", "type": "long"}
  115. ],
  116. "values": [
  117. [1, [1, 2, 2]],
  118. [2, [1, 1]]
  119. ]
  120. }
  121. ----
  122. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]
  123. This is all at the storage layer. If you store duplicate `long`s and then
  124. convert them to strings the duplicates will stay:
  125. [source,console,id=esql-multivalued-fields-longdups-tostring]
  126. ----
  127. PUT /mv
  128. {
  129. "mappings": {
  130. "properties": {
  131. "b": {"type": "long"}
  132. }
  133. }
  134. }
  135. POST /mv/_bulk?refresh
  136. { "index" : {} }
  137. { "a": 1, "b": [2, 2, 1] }
  138. { "index" : {} }
  139. { "a": 2, "b": [1, 1] }
  140. POST /_query
  141. {
  142. "query": "FROM mv | EVAL b=TO_STRING(b) | LIMIT 2"
  143. }
  144. ----
  145. [source,console-result]
  146. ----
  147. {
  148. "took": 28,
  149. "is_partial": false,
  150. "documents_found": 2,
  151. "values_loaded": 7,
  152. "columns": [
  153. { "name": "a", "type": "long"},
  154. { "name": "b", "type": "keyword"}
  155. ],
  156. "values": [
  157. [1, ["1", "2", "2"]],
  158. [2, ["1", "1"]]
  159. ]
  160. }
  161. ----
  162. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]
  163. [discrete]
  164. [[esql-multivalued-nulls]]
  165. ==== `null` in a list
  166. `null` values in a list are not preserved at the storage layer:
  167. [source,console,id=esql-multivalued-fields-multivalued-nulls]
  168. ----
  169. POST /mv/_doc?refresh
  170. { "a": [2, null, 1] }
  171. POST /_query
  172. {
  173. "query": "FROM mv | LIMIT 1"
  174. }
  175. ----
  176. [source,console-result]
  177. ----
  178. {
  179. "took": 28,
  180. "is_partial": false,
  181. "documents_found": 1,
  182. "values_loaded": 2,
  183. "columns": [
  184. { "name": "a", "type": "long"},
  185. ],
  186. "values": [
  187. [[1, 2]],
  188. ]
  189. }
  190. ----
  191. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]
  192. [discrete]
  193. [[esql-multivalued-fields-functions]]
  194. ==== Functions
  195. Unless otherwise documented functions will return `null` when applied to a multivalued
  196. field.
  197. [source,console,id=esql-multivalued-fields-mv-into-null]
  198. ----
  199. POST /mv/_bulk?refresh
  200. { "index" : {} }
  201. { "a": 1, "b": [2, 1] }
  202. { "index" : {} }
  203. { "a": 2, "b": 3 }
  204. ----
  205. [source,console]
  206. ----
  207. POST /_query
  208. {
  209. "query": "FROM mv | EVAL b + 2, a + b | LIMIT 4"
  210. }
  211. ----
  212. // TEST[continued]
  213. // TEST[warning:Line 1:16: evaluation of [b + 2] failed, treating result as null. Only first 20 failures recorded.]
  214. // TEST[warning:Line 1:16: java.lang.IllegalArgumentException: single-value function encountered multi-value]
  215. // TEST[warning:Line 1:23: evaluation of [a + b] failed, treating result as null. Only first 20 failures recorded.]
  216. // TEST[warning:Line 1:23: java.lang.IllegalArgumentException: single-value function encountered multi-value]
  217. [source,console-result]
  218. ----
  219. {
  220. "took": 28,
  221. "is_partial": false,
  222. "documents_found": 2,
  223. "values_loaded": 5,
  224. "columns": [
  225. { "name": "a", "type": "long"},
  226. { "name": "b", "type": "long"},
  227. { "name": "b + 2", "type": "long"},
  228. { "name": "a + b", "type": "long"}
  229. ],
  230. "values": [
  231. [1, [1, 2], null, null],
  232. [2, 3, 5, 5]
  233. ]
  234. }
  235. ----
  236. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]
  237. Work around this limitation by converting the field to single value with one of:
  238. * <<esql-mv_avg>>
  239. * <<esql-mv_concat>>
  240. * <<esql-mv_count>>
  241. * <<esql-mv_max>>
  242. * <<esql-mv_median>>
  243. * <<esql-mv_min>>
  244. * <<esql-mv_sum>>
  245. [source,console,esql-multivalued-fields-mv-min]
  246. ----
  247. POST /_query
  248. {
  249. "query": "FROM mv | EVAL b=MV_MIN(b) | EVAL b + 2, a + b | LIMIT 4"
  250. }
  251. ----
  252. // TEST[continued]
  253. [source,console-result]
  254. ----
  255. {
  256. "took": 28,
  257. "is_partial": false,
  258. "documents_found": 2,
  259. "values_loaded": 5,
  260. "columns": [
  261. { "name": "a", "type": "long"},
  262. { "name": "b", "type": "long"},
  263. { "name": "b + 2", "type": "long"},
  264. { "name": "a + b", "type": "long"}
  265. ],
  266. "values": [
  267. [1, 1, 3, 2],
  268. [2, 3, 5, 5]
  269. ]
  270. }
  271. ----
  272. // TESTRESPONSE[s/"took": 28/"took": "$body.took"/]