[role="xpack"]
[testenv="basic"]
[[put-transform]]
=== Create {transforms} API

[subs="attributes"]
++++
<titleabbrev>Create {transforms}</titleabbrev>
++++

Instantiates a {transform}.

beta[]

[[put-transform-request]]
==== {api-request-title}

`PUT _transform/<transform_id>`

[[put-transform-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have
`manage_data_frame_transforms` cluster privileges to use this API. The built-in
`data_frame_transforms_admin` role has these privileges. You must also
have `read` and `view_index_metadata` privileges on the source index and `read`,
`create_index`, and `index` privileges on the destination index. For more
information, see <<security-privileges>> and <<built-in-roles>>.
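
The privilege requirements above can be expressed as a small checklist. The
following Python sketch is illustrative only: `missing_privileges` is a
hypothetical helper, it compares literal privilege names, and it does not
account for broader privileges (such as `all`) that may imply the listed ones.

```python
# Privileges named in the prerequisites above.
REQUIRED_CLUSTER = {"manage_data_frame_transforms"}
REQUIRED_SOURCE = {"read", "view_index_metadata"}
REQUIRED_DEST = {"read", "create_index", "index"}


def missing_privileges(cluster, source, dest):
    """Return the listed privileges that are absent, grouped by scope.

    `cluster`, `source`, and `dest` are iterables of privilege names that the
    caller already knows the user holds. This helper is hypothetical and does
    not call any Elasticsearch API.
    """
    return {
        "cluster": sorted(REQUIRED_CLUSTER - set(cluster)),
        "source": sorted(REQUIRED_SOURCE - set(source)),
        "dest": sorted(REQUIRED_DEST - set(dest)),
    }
```

An empty list for every scope means that all of the privileges named above are
present.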

[[put-transform-desc]]
==== {api-description-title}

This API defines a {transform}, which copies data from source indices,
transforms it, and persists it into an entity-centric destination index. The
entities are defined by the set of `group_by` fields in the `pivot` object. You
can also think of the destination index as a two-dimensional tabular data
structure (known as a {dataframe}). The ID for each document in the
{dataframe} is generated from a hash of the entity, so there is a unique row
per entity. For more information, see <<transforms>>.
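
To see why hashing the entity yields one row per entity, consider the following
Python sketch. It is an illustration under stated assumptions: the function
`entity_doc_id` is hypothetical, and SHA-256 over canonical JSON is a stand-in,
because the hash that {es} uses is not specified here.

```python
import hashlib
import json


def entity_doc_id(group_by_values):
    """Derive a deterministic document ID from an entity's group_by values.

    Assumption: SHA-256 over canonical JSON is used purely for illustration.
    """
    # Sorting keys makes the ID independent of the order of the fields.
    canonical = json.dumps(group_by_values, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the ID depends only on the `group_by` values, every update for the same
entity targets the same document, and distinct entities get distinct documents.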

When the {transform} is created, a series of validations occurs to
ensure its success. For example, there is a check for the existence of the
source indices and a check that the destination index is not part of the source
index pattern. You can use the `defer_validation` parameter to skip these
checks.
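
The check that the destination index is not part of the source index pattern
can be sketched in Python as follows. The function `dest_overlaps_source` is
hypothetical and handles only the `*` wildcard; it illustrates the idea and is
not the validation code that {es} runs.

```python
import re


def _pattern_to_regex(pattern):
    # Treat "*" as "any sequence of characters"; escape everything else.
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))


def dest_overlaps_source(source_index, dest_index):
    """Return True if dest_index matches any source index or index pattern.

    `source_index` is a string or a list of strings, as in the request body.
    Assumption: only the "*" wildcard is supported in this sketch.
    """
    patterns = [source_index] if isinstance(source_index, str) else list(source_index)
    return any(_pattern_to_regex(p).fullmatch(dest_index) for p in patterns)
```

For example, a source of `"myindex*"` overlaps a destination named
`myindex_dest`, so a configuration like that would be expected to fail this
check.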

Deferred validations are always run when the {transform} is started,
with the exception of privilege checks. When {es} {security-features} are
enabled, the {transform} remembers which roles the user that created
it had at the time of creation and uses those same roles. If those roles do not
have the required privileges on the source and destination indices, the
{transform} fails when it attempts unauthorized operations.

IMPORTANT: You must use {kib} or this API to create a {transform}.
Do not put a {transform} directly into any
`.data-frame-internal*` indices using the Elasticsearch index API.
If {es} {security-features} are enabled, do not give users any
privileges on `.data-frame-internal*` indices.

[[put-transform-path-parms]]
==== {api-path-parms-title}

`<transform_id>`::
(Required, string) Identifier for the {transform}. This identifier
can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and
underscores. It must start and end with alphanumeric characters.
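
If you want to reject malformed identifiers before you send a request, the
naming rules above translate into a short regular expression. This Python
sketch encodes only the rules stated here; the name `is_valid_transform_id` is
hypothetical, and {es} may enforce additional constraints that this page does
not list.

```python
import re

# First and last characters: a-z or 0-9.
# Characters in between: a-z, 0-9, hyphen, or underscore.
_TRANSFORM_ID = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_transform_id(transform_id):
    """Return True if transform_id follows the rules stated above."""
    return _TRANSFORM_ID.fullmatch(transform_id) is not None
```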

[[put-transform-query-parms]]
==== {api-query-parms-title}

`defer_validation`::
(Optional, boolean) When `true`, deferrable validations are not run. This
behavior may be desired if the source index does not exist until after the
{transform} is created.
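
The following Python sketch shows how the path parameter and the
`defer_validation` query parameter combine into a request path. The helper
`put_transform_path` is hypothetical; it only builds the path string and does
not contact a cluster.

```python
from urllib.parse import quote, urlencode


def put_transform_path(transform_id, defer_validation=False):
    """Build the path for `PUT _transform/<transform_id>`.

    The query string is added only when defer_validation is requested.
    """
    path = "/_transform/" + quote(transform_id, safe="")
    if defer_validation:
        path += "?" + urlencode({"defer_validation": "true"})
    return path
```

Send the resulting path with the `PUT` method and the request body described in
the next section.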

[[put-transform-request-body]]
==== {api-request-body-title}

`description`::
(Optional, string) Free text description of the {transform}.

`dest`::
(Required, object) The destination configuration, which has the
following properties:

`index`:::
(Required, string) The _destination index_ for the {transform}.

`pipeline`:::
(Optional, string) The unique identifier for a <<pipeline,pipeline>>.

`frequency`::
(Optional, <<time-units, time units>>) The interval between checks for changes in the source
indices when the {transform} is running continuously. Also determines
the retry interval in the event of transient failures while the {transform} is
searching or indexing. The minimum value is `1s` and the maximum is `1h`. The
default value is `1m`.

`pivot`::
(Required, object) Defines the `group_by` fields of the pivot function and the
aggregations that reduce the data. See <<transform-pivot>>.

`source`::
(Required, object) The source configuration, which has the following
properties:

`index`:::
(Required, string or array) The _source indices_ for the
{transform}. It can be a single index, an index pattern (for
example, `"myindex*"`), or an array of indices (for example,
`["index1", "index2"]`).

`query`:::
(Optional, object) A query clause that retrieves a subset of data from the
source index. See <<query-dsl>>.

`sync`::
(Optional, object) Defines the properties that the {transform} requires to run
continuously.

`time`:::
(Required, object) Specifies that the {transform} uses a time
field to synchronize the source and destination indices.

`field`::::
(Required, string) The date field that is used to identify new documents
in the source.
+
--
TIP: In general, it’s a good idea to use a field that contains the
<<accessing-ingest-metadata,ingest timestamp>>. If you use a different field,
you might need to set the `delay` such that it accounts for data transmission
delays.
--

`delay`::::
(Optional, <<time-units, time units>>) The time delay between the current time and the
latest input data time. The default value is `60s`.
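
Both `frequency` and `delay` take time unit values. The following Python sketch
checks a `frequency` value against the `1s` to `1h` range stated above and
computes the latest input data time implied by a `delay`. The helper names
(`to_nanos`, `is_valid_frequency`, `sync_cutoff`) are hypothetical, the parser
accepts only whole numbers followed by a unit suffix, and the cutoff formula
(current time minus `delay`) is a reading of the `delay` description rather
than a statement of how {es} implements it.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit suffixes expressed in nanoseconds so that all arithmetic stays integral.
_UNIT_NANOS = {
    "nanos": 1,
    "micros": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
}
_TIME_VALUE = re.compile(r"(\d+)(nanos|micros|ms|s|m|h|d)")


def to_nanos(value):
    """Convert a value such as "5m" or "60s" to nanoseconds."""
    match = _TIME_VALUE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a time unit value: {value!r}")
    return int(match.group(1)) * _UNIT_NANOS[match.group(2)]


def is_valid_frequency(value):
    """Return True if value lies within the documented 1s..1h range."""
    return to_nanos("1s") <= to_nanos(value) <= to_nanos("1h")


def sync_cutoff(now, delay="60s"):
    """Return the latest input data time, assumed to be now minus delay."""
    # timedelta has microsecond resolution, so sub-microsecond delays truncate.
    return now - timedelta(microseconds=to_nanos(delay) // 1_000)
```

With the default `delay` of `60s`, a check that runs at 12:01:00 would treat
12:00:00 as the latest input data time under this assumption.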

[[put-transform-example]]
==== {api-examples-title}

[source,console]
--------------------------------------------------
PUT _transform/ecommerce_transform
{
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "term": {
        "geoip.continent_name": {
          "value": "Asia"
        }
      }
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "max_price": {
        "max": {
          "field": "taxful_total_price"
        }
      }
    }
  },
  "description": "Maximum priced ecommerce data by customer_id in Asia",
  "dest": {
    "index": "kibana_sample_data_ecommerce_transform",
    "pipeline": "add_timestamp_pipeline"
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "order_date",
      "delay": "60s"
    }
  }
}
--------------------------------------------------
// TEST[setup:kibana_sample_data_ecommerce]

When the {transform} is created, you receive the following results:

[source,console-result]
----
{
  "acknowledged" : true
}
----
  161. ----