esql-lookup-join.asciidoc 6.2 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183
  1. === LOOKUP JOIN
  2. ++++
  3. <titleabbrev>Correlate data with LOOKUP JOIN</titleabbrev>
  4. ++++
  5. The {esql} <<esql-lookup-join,LOOKUP join>>
  6. processing command combines data from your {esql} query results
  7. table with matching records from a specified lookup index. It adds
  8. fields from the lookup index as new columns to your results table based
  9. on matching values in the join field.
  10. Teams often have data scattered across multiple indices – like logs,
  11. IPs, user IDs, hosts, employees etc. Without a direct way to enrich or
  12. correlate each event with reference data, root-cause analysis, security
  13. checks, and operational insights become time-consuming.
  14. For example, you can use `LOOKUP JOIN` to:
  15. * Retrieve environment or ownership details for each host to correlate
  16. your metrics data.
  17. * Quickly see if any source IPs match known malicious addresses.
  18. * Tag logs with the owning team or escalation info for faster triage and
  19. incident response.
  20. <<esql-lookup-join,LOOKUP join>> is similar to <<esql-enrich-data,ENRICH>>
  21. in the fact that they both help you join data together. You should use
  22. `LOOKUP JOIN` when:
  23. * Your enrichment data changes frequently
  24. * You want to avoid index-time processing
  25. * You want SQL-like behavior, so that multiple matches result in multiple rows
  26. * You need to match on any field in a lookup index
  27. * You use document or field level security
  28. * You want to restrict users to use only specific lookup indices
  29. * You do not need to match using ranges or spatial relations
  30. [discrete]
  31. [[esql-how-lookup-join-works]]
  32. ==== How the `LOOKUP JOIN` command works
  33. The `LOOKUP JOIN` command adds new columns to a table, with data from
  34. {es} indices.
  35. image::images/esql/esql-lookup-join.png[align="center"]
  36. [[esql-lookup-join-lookup-index]]
  37. lookup_index::
  38. The name of the lookup index. This must
  39. be a specific index name - wildcards, aliases, and remote cluster
  40. references are not supported.
  41. [[esql-lookup-join-field-name]]
  42. field_name::
  43. The field to join on. This field must exist
  44. in both your current query results and in the lookup index. If the field
  45. contains multi-valued entries, those entries will not match anything
  46. (the added fields will contain `null` for those rows).
  47. [discrete]
  48. [[esql-lookup-join-example]]
  49. ==== Example
  50. `LOOKUP JOIN` has left-join behavior. If no rows match in the lookup index, `LOOKUP JOIN` retains the incoming row and adds nulls. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
  51. In this example, we have two sample tables:
  52. *employees*
  53. [cols=",,,,,",options="header",]
  54. |===
  55. |birth++_++date |emp++_++no |first++_++name |gender |hire++_++date
  56. |language
  57. |1955-10-04T00:00:00Z |10091 |Amabile |M |1992-11-18T00:00:00Z |3
  58. |1964-10-18T00:00:00Z |10092 |Valdiodio |F |1989-09-22T00:00:00Z |1
  59. |1964-06-11T00:00:00Z |10093 |Sailaja |M |1996-11-05T00:00:00Z |3
  60. |1957-05-25T00:00:00Z |10094 |Arumugam |F |1987-04-18T00:00:00Z |5
  61. |1965-01-03T00:00:00Z |10095 |Hilari |M |1986-07-15T00:00:00Z |4
  62. |===
  63. *languages++_++non++_++unique++_++key*
  64. [cols=",,",options="header",]
  65. |===
  66. |language++_++code |language++_++name |country
  67. |1 |English |Canada
  68. |1 |English |
  69. |1 | |United Kingdom
  70. |1 |English |United States of America
  71. |2 |German |++[++Germany{vbar}Austria++]++
  72. |2 |German |Switzerland
  73. |2 |German |
  74. |4 |Quenya |
  75. |5 | |Atlantis
  76. |++[++6{vbar}7++]++ |Mv-Lang |Mv-Land
  77. |++[++7{vbar}8++]++ |Mv-Lang2 |Mv-Land2
  78. |Null-Lang |Null-Land |
  79. |Null-Lang2 |Null-Land2 |
  80. |===
  81. Running the following query would provide the results shown below.
  82. [source,esql]
  83. ----
  84. FROM employees
  85. | EVAL language_code = emp_no % 10
  86. | LOOKUP JOIN languages_lookup_non_unique_key ON language_code
  87. | WHERE emp_no > 10090 AND emp_no < 10096
  88. | SORT emp_no, country
  89. | KEEP emp_no, language_code, language_name, country;
  90. ----
  91. [cols=",,,",options="header",]
  92. |===
  93. |emp++_++no |language++_++code |language++_++name |country
  94. |10091 |1 |English |Canada
  95. |10091 |1 |null |United Kingdom
  96. |10091 |1 |English |United States of America
  97. |10091 |1 |English |null
  98. |10092 |2 |German |++[++Germany, Austria++]++
  99. |10092 |2 |German |Switzerland
  100. |10092 |2 |German |null
  101. |10093 |3 |null |null
  102. |10094 |4 |Spanish |null
  103. |10095 |5 |null |France
  104. |===
  105. [IMPORTANT]
  106. ====
  107. `LOOKUP JOIN` does not guarantee the output to be in
  108. any particular order. If a certain order is required, users should use a
  109. <<esql-sort,`SORT`>> somewhere after the `LOOKUP JOIN`.
  110. ====
  111. [discrete]
  112. [[esql-lookup-join-prereqs]]
  113. ==== Prerequisites
  114. To use `LOOKUP JOIN`, the following requirements must be met:
  115. * *Compatible data types*: The join key and join field in the lookup
  116. index must have compatible data types. This means:
  117. ** The data types must either be identical or be internally represented
  118. as the same type in {esql}
  119. ** Numeric types follow these compatibility rules:
  120. *** `short` and `byte` are compatible with `integer` (all represented as
  121. `int`)
  122. *** `float`, `half_float`, and `scaled_float` are compatible
  123. with `double` (all represented as `double`)
  124. ** For text fields: You can only use text fields as the join key on the
  125. left-hand side of the join and only if they have a `.keyword` subfield
  126. To obtain a join key with a compatible type, use a
  127. <<esql-type-conversion-functions,conversion function>> if needed.
  128. For a complete list of supported data types and their internal
  129. representations, see the <<esql-supported-types,Supported Field Types documentation>>.
  130. [discrete]
  131. [[esql-lookup-join-limitations]]
  132. ==== Limitations
  133. The following are the current limitations with `LOOKUP JOIN`
  134. * Indices in <<index-mode-setting,lookup>> mode are always single-sharded.
  135. * Cross cluster search is unsupported initially. Both source and lookup indices
  136. must be local.
  137. * Currently, only matching on equality is supported.
  138. * `LOOKUP JOIN` can only use a single match field and a single index.
  139. Wildcards, aliases, datemath, and datastreams are not supported.
  140. * The name of the match field in
  141. `LOOKUP JOIN lu++_++idx ON match++_++field` must match an existing field
  142. in the query. This may require renames or evals to achieve.
  143. * The query will circuit break if there are too many matching documents
  144. in the lookup index, or if the documents are too large. More precisely,
  145. `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large
  146. amount of heap space is needed if the matching documents from the lookup
  147. index for a batch are multiple megabytes or larger. This is roughly the
  148. same as for `ENRICH`.