
[role="xpack"]
[testenv="basic"]
[[put-transform]]
=== Create {transforms} API

[subs="attributes"]
++++
<titleabbrev>Create {transforms}</titleabbrev>
++++

Instantiates a {transform}.

beta[]

[[put-transform-request]]
==== {api-request-title}

`PUT _data_frame/transforms/<transform_id>`

[[put-transform-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have
`manage_data_frame_transforms` cluster privileges to use this API. The built-in
`data_frame_transforms_admin` role has these privileges. You must also
have `read` and `view_index_metadata` privileges on the source index and `read`,
`create_index`, and `index` privileges on the destination index. For more
information, see {stack-ov}/security-privileges.html[Security privileges] and
{stack-ov}/built-in-roles.html[Built-in roles].

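
As an illustration of the privileges listed above, the following request
sketches a custom role that grants them. The role name
`transform_ecommerce_user` is hypothetical, and the index names are borrowed
from the example later on this page; adjust them to match your own source and
destination indices.

[source,console]
--------------------------------------------------
POST /_security/role/transform_ecommerce_user
{
  "cluster": [ "manage_data_frame_transforms" ],
  "indices": [
    {
      "names": [ "kibana_sample_data_ecommerce" ],
      "privileges": [ "read", "view_index_metadata" ]
    },
    {
      "names": [ "kibana_sample_data_ecommerce_transform" ],
      "privileges": [ "read", "create_index", "index" ]
    }
  ]
}
--------------------------------------------------
// TEST[skip:illustrative sketch with a hypothetical role name]
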
[[put-transform-desc]]
==== {api-description-title}

This API defines a {transform}, which copies data from source indices,
transforms it, and persists it into an entity-centric destination index. The
entities are defined by the set of `group_by` fields in the `pivot` object. You
can also think of the destination index as a two-dimensional tabular data
structure (known as a {dataframe}). The ID for each document in the
{dataframe} is generated from a hash of the entity, so there is a unique row
per entity. For more information, see <<transforms>>.

When the {transform} is created, a series of validations occurs to
ensure its success. For example, there is a check for the existence of the
source indices and a check that the destination index is not part of the source
index pattern. You can use the `defer_validation` parameter to skip these
checks.

Deferred validations are always run when the {transform} is started,
with the exception of privilege checks. When {es} {security-features} are
enabled, the {transform} remembers which roles the user that created
it had at the time of creation and uses those same roles. If those roles do not
have the required privileges on the source and destination indices, the
{transform} fails when it attempts unauthorized operations.

IMPORTANT: You must use {kib} or this API to create a {transform}.
           Do not put a {transform} directly into any
           `.data-frame-internal*` indices using the {es} index API.
           If {es} {security-features} are enabled, do not give users any
           privileges on `.data-frame-internal*` indices.

[[put-transform-path-parms]]
==== {api-path-parms-title}

`<transform_id>`::
  (Required, string) Identifier for the {transform}. This identifier
  can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and
  underscores. It must start and end with alphanumeric characters.

[[put-transform-query-parms]]
==== {api-query-parms-title}

`defer_validation`::
  (Optional, boolean) When `true`, deferrable validations are not run. This is
  useful if the source index does not exist until after the {transform} is
  created.

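
For instance, if the source index is created only after the {transform} is
defined, the request might be sketched as follows. The transform ID
`deferred_transform` and the index names `future_source_index` and
`future_dest_index` are hypothetical; the body contains only the properties
that this page marks as required.

[source,console]
--------------------------------------------------
PUT _data_frame/transforms/deferred_transform?defer_validation=true
{
  "source": { "index": "future_source_index" },
  "dest": { "index": "future_dest_index" },
  "pivot": {
    "group_by": {
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "max_price": { "max": { "field": "taxful_total_price" } }
    }
  }
}
--------------------------------------------------
// TEST[skip:illustrative sketch with hypothetical index names]

As noted in the description, the deferred checks still run when the
{transform} is started, so the source index must exist by then.
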
[[put-transform-request-body]]
==== {api-request-body-title}

`description`::
  (Optional, string) Free text description of the {transform}.

`dest`::
  (Required, object) The destination configuration, which has the
  following properties:

  `index`:::
    (Required, string) The _destination index_ for the {transform}.

  `pipeline`:::
    (Optional, string) The unique identifier for a <<pipeline,pipeline>>.

`frequency`::
  (Optional, <<time-units, time units>>) The interval between checks for
  changes in the source indices when the {transform} is running continuously.
  Also determines the retry interval in the event of transient failures while
  the {transform} is searching or indexing. The minimum value is `1s` and the
  maximum is `1h`. The default value is `1m`.

`pivot`::
  (Required, object) Defines the pivot function's `group_by` fields and the
  aggregations used to reduce the data. See <<transform-pivot>>.

`source`::
  (Required, object) The source configuration, which has the following
  properties:

  `index`:::
    (Required, string or array) The _source indices_ for the
    {transform}. It can be a single index, an index pattern (for
    example, `"myindex*"`), or an array of indices (for example,
    `["index1", "index2"]`).

  `query`:::
    (Optional, object) A query clause that retrieves a subset of data from the
    source index. See <<query-dsl>>.

`sync`::
  (Optional, object) Defines the properties required to run continuously.

  `time`:::
    (Required, object) Specifies that the {transform} uses a time
    field to synchronize the source and destination indices.

    `field`::::
      (Required, string) The date field that is used to identify new documents
      in the source.
+
--
TIP: In general, it's a good idea to use a field that contains the
<<accessing-ingest-metadata,ingest timestamp>>. If you use a different field,
you might need to set the `delay` such that it accounts for data transmission
delays.
--

    `delay`::::
      (Optional, <<time-units, time units>>) The time delay between the current
      time and the latest input data time. The default value is `60s`.

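
To make the nesting of these properties concrete, here is a sketch of a request
for a continuous {transform} that reads from an array of indices. The transform
ID, the index names, and the field names (`customer_id`, `price`,
`ingest_timestamp`) are all hypothetical.

[source,console]
--------------------------------------------------
PUT _data_frame/transforms/orders_by_customer
{
  "description": "Average order price per customer",
  "source": {
    "index": [ "orders-2019", "orders-2020" ],
    "query": { "match_all": {} }
  },
  "dest": { "index": "orders_by_customer" },
  "frequency": "1m",
  "pivot": {
    "group_by": {
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "avg_price": { "avg": { "field": "price" } }
    }
  },
  "sync": {
    "time": { "field": "ingest_timestamp", "delay": "120s" }
  }
}
--------------------------------------------------
// TEST[skip:illustrative sketch with hypothetical index and field names]

Here `delay` is raised above its `60s` default on the assumption that documents
can arrive late. Because `sync` defines the properties required to run
continuously, a request without it does not configure a continuous {transform}.
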
[[put-transform-example]]
==== {api-examples-title}

[source,console]
--------------------------------------------------
PUT _data_frame/transforms/ecommerce_transform
{
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "term": {
        "geoip.continent_name": {
          "value": "Asia"
        }
      }
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "max_price": {
        "max": {
          "field": "taxful_total_price"
        }
      }
    }
  },
  "description": "Maximum priced ecommerce data by customer_id in Asia",
  "dest": {
    "index": "kibana_sample_data_ecommerce_transform",
    "pipeline": "add_timestamp_pipeline"
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "order_date",
      "delay": "60s"
    }
  }
}
--------------------------------------------------
// TEST[setup:kibana_sample_data_ecommerce]

When the {transform} is created, you receive the following results:

[source,console-result]
----
{
  "acknowledged" : true
}
----
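
The request example above references an ingest pipeline named
`add_timestamp_pipeline`, which this page does not define. One plausible
definition, offered only as an assumption about what such a pipeline might
contain, uses a `set` processor to stamp each destination document with the
ingest timestamp. The target field name `@timestamp` is likewise an assumption.

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/add_timestamp_pipeline
{
  "description": "Adds an ingest timestamp to each document (illustrative)",
  "processors": [
    {
      "set": {
        "field": "@timestamp",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}
--------------------------------------------------
// TEST[skip:illustrative sketch of an assumed pipeline definition]
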