[role="xpack"]
[testenv="basic"]
[[ingest-enriching-data]]
== Enrich your data

You can use the <<enrich-processor,enrich processor>>
to append data from existing indices
to incoming documents during ingest.

For example, you can use the enrich processor to:

* Identify web services or vendors based on known IP addresses
* Add product information to retail orders based on product IDs
* Supplement contact information based on an email address
* Add postal codes based on user coordinates

[float]
[[enrich-setup]]
=== Set up an enrich processor

To set up an enrich processor and learn how it works,
follow these steps:

. Check the <<enrich-prereqs, prerequisites>>.
. <<create-enrich-source-index>>.
. <<create-enrich-policy>>.
. <<execute-enrich-policy>>.
. <<add-enrich-processor>>.
. <<ingest-enrich-docs>>.

Once you have an enrich processor set up,
you can <<update-enrich-data,update your enrich data>>
and <<update-enrich-policies, update your enrich policies>>
using the <<enrich-apis,enrich APIs>>.

[IMPORTANT]
====
The enrich processor performs several operations
and may impact the speed of your <<pipeline,ingest pipeline>>.

We strongly recommend testing and benchmarking your enrich processors
before deploying them in production.

We do not recommend using the enrich processor to append real-time data.
The enrich processor works best with reference data
that doesn't change frequently.
====

[float]
[[enrich-prereqs]]
==== Prerequisites

include::{docdir}/ingest/apis/enrich/put-enrich-policy.asciidoc[tag=enrich-policy-api-prereqs]

[float]
[[create-enrich-source-index]]
==== Create a source index

To begin,
create one or more source indices.

A _source index_ contains data you want to append to incoming documents.
You can index and manage documents in a source index
like a regular index.

The following <<docs-index_,index API>> request creates the `users` source index
and indexes a new document containing user data to it.

[source,console]
----
PUT /users/_doc/1?refresh=wait_for
{
  "email": "mardy.brown@asciidocsmith.com",
  "first_name": "Mardy",
  "last_name": "Brown",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "web": "mardy.asciidocsmith.com"
}
----

You can also set up {beats-ref}/getting-started.html[{beats}],
such as {filebeat-ref}/filebeat-getting-started.html[{filebeat}],
to automatically send and index documents
to your source indices.
See {beats-ref}/getting-started.html[Getting started with {beats}].

[float]
[[create-enrich-policy]]
==== Create an enrich policy

Use the <<put-enrich-policy-api,put enrich policy API>>
to create an enrich policy.

include::{docdir}/ingest/apis/enrich/put-enrich-policy.asciidoc[tag=enrich-policy-def]

[source,console]
----
PUT /_enrich/policy/users-policy
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "city", "zip", "state"]
  }
}
----
// TEST[continued]
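
Before executing a policy, it can be useful to confirm that it was stored as intended.
The following request is a minimal sketch that assumes the get enrich policy API
is available in your version.
It reuses the `users-policy` name from the example on this page.

[source,console]
----
GET /_enrich/policy/users-policy
----
// TEST[continued]

Check that the returned match field and enrich fields
are the ones you expect before moving on.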

[float]
[[execute-enrich-policy]]
==== Execute an enrich policy

Use the <<execute-enrich-policy-api,execute enrich policy API>>
to create an enrich index for the policy.

include::apis/enrich/execute-enrich-policy.asciidoc[tag=execute-enrich-policy-def]

The following request executes the `users-policy` enrich policy.
Because this API request performs several operations,
it may take a while to return a response.

[source,console]
----
POST /_enrich/policy/users-policy/_execute
----
// TEST[continued]
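
Because execution can take a while,
you may want to check whether a policy is still running.
The following request is a sketch that assumes your version provides the enrich stats API.
If it does, the response is expected to include information
about any enrich policies that are currently executing.

[source,console]
----
GET /_enrich/_stats
----
// TEST[continued]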

[float]
[[add-enrich-processor]]
==== Add the enrich processor to an ingest pipeline

Use the <<put-pipeline-api,put pipeline API>>
to create an ingest pipeline.
Include an <<enrich-processor,enrich processor>>
that uses your enrich policy.

When defining an enrich processor,
you must include the following:

* The field used to match incoming documents
  to documents in the enrich index.
+
This field should be included in incoming documents.

* The target field added to incoming documents.
  This field contains all appended enrich data.

The following request adds a new pipeline, `user_lookup`.
This pipeline includes an enrich processor
that uses the `users-policy` enrich policy.

[source,console]
----
PUT /_ingest/pipeline/user_lookup
{
  "description" : "Enriching user details to messages",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "users-policy",
        "field" : "email",
        "target_field": "user"
      }
    }
  ]
}
----
// TEST[continued]

You can also add other <<ingest-processors,processors>>
to your ingest pipeline.
You can use these processors to change or drop incoming documents
based on your criteria.
See <<ingest-processors>> for a list of built-in processors.
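
To try out a combination of processors without storing a pipeline or indexing documents,
you can use the simulate pipeline API with an inline pipeline definition.
The following request is a sketch, not a recommended configuration.
It runs the enrich processor and then uses a conditional `set` processor
to flag documents for which no enrich data was appended.
The `user_found` field and the `unknown.user@example.com` address
are hypothetical and used only for illustration.

[source,console]
----
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Hypothetical example: enrich, then flag documents with no match",
    "processors": [
      {
        "enrich": {
          "policy_name": "users-policy",
          "field": "email",
          "target_field": "user"
        }
      },
      {
        "set": {
          "if": "ctx.user == null",
          "field": "user_found",
          "value": false
        }
      }
    ]
  },
  "docs": [
    { "_source": { "email": "mardy.brown@asciidocsmith.com" } },
    { "_source": { "email": "unknown.user@example.com" } }
  ]
}
----
// TEST[continued]

The same condition could be used with a `drop` processor
if you would rather discard unmatched documents than flag them.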

[float]
[[ingest-enrich-docs]]
==== Ingest and enrich documents

Index incoming documents using your ingest pipeline.

Because the enrich policy type is `match`,
the enrich processor matches incoming documents
to documents in the enrich index
based on match field values.
The processor then appends the enrich field data
from any matching document in the enrich index
to the target field of the incoming document.

The enrich processor appends all data to the target field as an array.
If the incoming document matches more than one document in the enrich index,
the processor appends data from those documents to the array.

If the incoming document matches no documents in the enrich index,
the processor appends no data.

The following <<docs-index_,index API>> request uses the ingest pipeline
to index a document
containing the `email` field
specified in the enrich processor.

[source,console]
----
PUT /my_index/_doc/my_id?pipeline=user_lookup
{
  "email": "mardy.brown@asciidocsmith.com"
}
----
// TEST[continued]

To verify the enrich processor matched
and appended the appropriate field data,
use the <<docs-get,get API>> to view the indexed document.

[source,console]
----
GET /my_index/_doc/my_id
----
// TEST[continued]

The API returns the following response:

[source,console-result]
----
{
  "found": true,
  "_index": "my_index",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "user": [
      {
        "email": "mardy.brown@asciidocsmith.com",
        "first_name": "Mardy",
        "last_name": "Brown",
        "zip": 70116,
        "city": "New Orleans",
        "state": "LA"
      }
    ],
    "email": "mardy.brown@asciidocsmith.com"
  }
}
----
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term":1/"_primary_term" : $body._primary_term/]

[float]
[[update-enrich-data]]
=== Update your enrich index

include::{docdir}/ingest/apis/enrich/execute-enrich-policy.asciidoc[tag=update-enrich-index]

If needed, you can <<docs-reindex,reindex>>
or <<docs-update-by-query,update>> any already ingested documents
using your ingest pipeline.
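
As a rough sketch of the update-by-query approach,
the following request re-runs the `user_lookup` pipeline
against every document in `my_index`, the example index used earlier on this page.
It assumes the enrich index has already been updated,
so that the pipeline appends the current enrich data.
On large indices, consider testing on a subset of documents first.

[source,console]
----
POST /my_index/_update_by_query?pipeline=user_lookup
----
// TEST[continued]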

[float]
[[update-enrich-policies]]
=== Update an enrich policy

include::apis/enrich/put-enrich-policy.asciidoc[tag=update-enrich-policy]

////
[source,console]
--------------------------------------------------
DELETE /_ingest/pipeline/user_lookup

DELETE /_enrich/policy/users-policy
--------------------------------------------------
// TEST[continued]
////