[role="xpack"]
[[ecommerce-transforms]]
= Tutorial: Transforming the eCommerce sample data

<<transforms,{transforms-cap}>> enable you to retrieve information
from an {es} index, transform it, and store it in another index. Let's use the
{kibana-ref}/add-sample-data.html[{kib} sample data] to demonstrate how you can
pivot and summarize your data with {transforms}.

. Verify that your environment is set up properly to use {transforms}. If the
{es} {security-features} are enabled, to complete this tutorial you need a user
that has authority to preview and create {transforms}. You must also have
specific index privileges for the source and destination indices. See
<<transform-setup>>.

. Choose your _source index_.
+
--
In this example, we'll use the eCommerce orders sample data. If you're not
already familiar with the `kibana_sample_data_ecommerce` index, use the
*Revenue* dashboard in {kib} to explore the data. Consider what insights you
might want to derive from this eCommerce data.
--

. Choose the pivot type of {transform} and play with various options for
grouping and aggregating the data.
+
--
There are two types of {transforms}, but first we'll try out _pivoting_ your
data, which involves using at least one field to group it and applying at least
one aggregation. You can preview what the transformed data will look
like, so go ahead and play with it! You can also enable histogram charts to get
a better understanding of the distribution of values in your data.

For example, you might want to group the data by product ID and calculate the
total number of sales for each product and its average price. Alternatively, you
might want to look at the behavior of individual customers and calculate how
much each customer spent in total and how many different categories of products
they purchased. Or you might want to take the currencies or geographies into
consideration. What are the most interesting ways you can transform and
interpret this data?

Go to *Management* > *Stack Management* > *Data* > *Transforms* in {kib} and use
the wizard to create a {transform}:

[role="screenshot"]
image::images/ecommerce-pivot1.png["Creating a simple {transform} in {kib}"]

Group the data by customer ID and add one or more aggregations to learn more
about each customer's orders. For example, let's calculate the total number of
products they purchased, the total price of their purchases, the maximum number
of products that they purchased in a single order, and their total number of
orders. We'll accomplish this by using the
<<search-aggregations-metrics-sum-aggregation,`sum` aggregation>> on the
`total_quantity` and `taxless_total_price` fields, the
<<search-aggregations-metrics-max-aggregation,`max` aggregation>> on the
`total_quantity` field, and the
<<search-aggregations-metrics-cardinality-aggregation,`cardinality` aggregation>>
on the `order_id` field:

[role="screenshot"]
image::images/ecommerce-pivot2.png["Adding multiple aggregations to a {transform} in {kib}"]

TIP: If you're interested in a subset of the data, you can optionally include a
<<request-body-search-query,query>> element. In this
example, we've filtered the data so that we're only looking at orders with a
`currency` of `EUR`. Alternatively, we could group the data by that field too.
If you want to use more complex queries, you can create your {transform} from a
{kibana-ref}/save-open-search.html[saved search].

If you prefer, you can use the
<<preview-transform,preview {transforms} API>>.

.API example
[%collapsible]
====
[source,console]
--------------------------------------------------
POST _transform/_preview
{
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "bool": {
        "filter": {
          "term": {"currency": "EUR"}
        }
      }
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "total_quantity.sum": {
        "sum": {
          "field": "total_quantity"
        }
      },
      "taxless_total_price.sum": {
        "sum": {
          "field": "taxless_total_price"
        }
      },
      "total_quantity.max": {
        "max": {
          "field": "total_quantity"
        }
      },
      "order_id.cardinality": {
        "cardinality": {
          "field": "order_id"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:set up sample data]
====
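
As a variation on the preview request, here is a minimal sketch that groups the
data by `currency` in addition to `customer_id` rather than filtering on it. The
group-by name `currency` and the aggregation name `taxless_total_price.avg` are
illustrative choices and not part of the tutorial's {transform}:

[source,console]
--------------------------------------------------
# Illustrative names: "currency" group and "taxless_total_price.avg" aggregation
POST _transform/_preview
{
  "source": {
    "index": "kibana_sample_data_ecommerce"
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      },
      "currency": {
        "terms": {
          "field": "currency"
        }
      }
    },
    "aggregations": {
      "taxless_total_price.avg": {
        "avg": {
          "field": "taxless_total_price"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:set up sample data]

With two `group_by` entries, each document in the preview represents one
combination of customer and currency.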
--

. When you are satisfied with what you see in the preview, create the
{transform}.
+
--
.. Supply a {transform} ID, the name of the destination index and optionally a
description. If the destination index does not exist, it will be created
automatically when you start the {transform}.

.. Decide whether you want the {transform} to run once or continuously. Since
this sample data index is unchanging, let's use the default behavior and just
run the {transform} once. If you want to try it out, however, go ahead and click
on *Continuous mode*. You must choose a field that the {transform} can use to
check which entities have changed. In general, it's a good idea to use the
ingest timestamp field. In this example, however, you can use the `order_date`
field.

.. Optionally, you can configure a retention policy that applies to your
{transform}. Select a date field that is used to identify old documents
in the destination index and provide a maximum age. Documents that are older
than the configured value are removed from the destination index.

[role="screenshot"]
image::images/ecommerce-pivot3.png["Adding transform ID and retention policy to a {transform} in {kib}"]

In {kib}, before you finish creating the {transform}, you can copy the preview
{transform} API request to your clipboard. This information is useful later when
you're deciding whether you want to manually create the destination index.

[role="screenshot"]
image::images/ecommerce-pivot4.png["Copy the Dev Console statement of the transform preview to the clipboard"]

If you prefer, you can use the
<<put-transform,create {transforms} API>>.

.API example
[%collapsible]
====
[source,console]
--------------------------------------------------
PUT _transform/ecommerce-customer-transform
{
  "source": {
    "index": [
      "kibana_sample_data_ecommerce"
    ],
    "query": {
      "bool": {
        "filter": {
          "term": {
            "currency": "EUR"
          }
        }
      }
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "total_quantity.sum": {
        "sum": {
          "field": "total_quantity"
        }
      },
      "taxless_total_price.sum": {
        "sum": {
          "field": "taxless_total_price"
        }
      },
      "total_quantity.max": {
        "max": {
          "field": "total_quantity"
        }
      },
      "order_id.cardinality": {
        "cardinality": {
          "field": "order_id"
        }
      }
    }
  },
  "dest": {
    "index": "ecommerce-customers"
  },
  "retention_policy": {
    "time": {
      "field": "order_date",
      "max_age": "60d"
    }
  }
}
--------------------------------------------------
// TEST[skip:setup kibana sample data]
====
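
If you selected *Continuous mode*, the equivalent API request also carries a
`sync` object that names the field used to detect changes. The following is a
minimal sketch, not the request that {kib} generates: the {transform} ID and
destination index are hypothetical, the `delay` and `frequency` values are
illustrative, and only one aggregation is kept for brevity.

[source,console]
--------------------------------------------------
# Hypothetical transform ID and destination index; delay and frequency are illustrative
PUT _transform/ecommerce-customer-transform-continuous
{
  "source": {
    "index": "kibana_sample_data_ecommerce"
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "taxless_total_price.sum": {
        "sum": {
          "field": "taxless_total_price"
        }
      }
    }
  },
  "dest": {
    "index": "ecommerce-customers-continuous"
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "order_date",
      "delay": "60s"
    }
  }
}
--------------------------------------------------
// TEST[skip:setup kibana sample data]

The `sync.time.field` value plays the role of the change-detection field that
you choose in the wizard.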
--

. Optional: Create the destination index.
+
--
If the destination index does not exist, it is created the first time you start
your {transform}. A pivot transform deduces the mappings for the destination
index from the source indices and the transform aggregations. If there are
fields in the destination index that are derived from scripts (for example,
if you use
<<search-aggregations-metrics-scripted-metric-aggregation,`scripted_metric`>>
or <<search-aggregations-pipeline-bucket-script-aggregation,`bucket_script`>>
aggregations), they're created with <<dynamic-mapping,dynamic mappings>>. You
can use the preview {transform} API to preview the mappings it will use for the
destination index. In {kib}, if you copied the API request to your
clipboard, paste it into the console, then refer to the `generated_dest_index`
object in the API response.

NOTE: {transforms-cap} might have more configuration options provided by the
APIs than the options available in {kib}. For example, you can set an ingest
pipeline for `dest` by calling the <<put-transform,create {transform} API>>.
For all the {transform} configuration options, refer to the
<<transform-apis,documentation>>.

.API example
[%collapsible]
====
[source,console-result]
--------------------------------------------------
{
  "preview" : [
    {
      "total_quantity" : {
        "max" : 2,
        "sum" : 118.0
      },
      "taxless_total_price" : {
        "sum" : 3946.9765625
      },
      "customer_id" : "10",
      "order_id" : {
        "cardinality" : 59
      }
    },
    ...
  ],
  "generated_dest_index" : {
    "mappings" : {
      "_meta" : {
        "_transform" : {
          "transform" : "transform-preview",
          "version" : {
            "created" : "8.0.0"
          },
          "creation_date_in_millis" : 1621991264061
        },
        "created_by" : "transform"
      },
      "properties" : {
        "total_quantity.sum" : {
          "type" : "double"
        },
        "total_quantity" : {
          "type" : "object"
        },
        "taxless_total_price" : {
          "type" : "object"
        },
        "taxless_total_price.sum" : {
          "type" : "double"
        },
        "order_id.cardinality" : {
          "type" : "long"
        },
        "customer_id" : {
          "type" : "keyword"
        },
        "total_quantity.max" : {
          "type" : "integer"
        },
        "order_id" : {
          "type" : "object"
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "auto_expand_replicas" : "0-1"
      }
    },
    "aliases" : { }
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:needs sample data]
====

In some instances the deduced mappings might be incompatible with the actual
data. For example, numeric overflows might occur or dynamically mapped fields
might contain both numbers and strings. To avoid this problem, create your
destination index before you start the {transform}. For more information, see
the <<indices-create-index,create index API>>.

.API example
[%collapsible]
====
You can use the information from the {transform} preview to create the
destination index. For example:

[source,console]
--------------------------------------------------
PUT /ecommerce-customers
{
  "mappings": {
    "properties": {
      "total_quantity.sum" : {
        "type" : "double"
      },
      "total_quantity" : {
        "type" : "object"
      },
      "taxless_total_price" : {
        "type" : "object"
      },
      "taxless_total_price.sum" : {
        "type" : "double"
      },
      "order_id.cardinality" : {
        "type" : "long"
      },
      "customer_id" : {
        "type" : "keyword"
      },
      "total_quantity.max" : {
        "type" : "integer"
      },
      "order_id" : {
        "type" : "object"
      }
    }
  }
}
--------------------------------------------------
// TEST
====
--

. Start the {transform}.
+
--
TIP: Even though resource utilization is automatically adjusted based on the
cluster load, a {transform} increases search and indexing load on your
cluster while it runs. If you're experiencing an excessive load, however, you
can stop it.

You can start, stop, and manage {transforms} in {kib}:

[role="screenshot"]
image::images/manage-transforms.png["Managing {transforms} in {kib}"]

Alternatively, you can use the
<<start-transform,start {transforms}>> and
<<stop-transform,stop {transforms}>> APIs.

.API example
[%collapsible]
====
[source,console]
--------------------------------------------------
POST _transform/ecommerce-customer-transform/_start
--------------------------------------------------
// TEST[skip:setup kibana sample data]
====

TIP: If you chose a batch {transform}, it is a single operation that has a
single checkpoint. You cannot restart it when it's complete. {ctransforms-cap}
differ in that they continually increment and process checkpoints as new source
data is ingested.
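
To watch the {transform} while it runs, or to stop it if the load becomes
excessive, requests along the following lines should work. This is a minimal
sketch that assumes the `ecommerce-customer-transform` ID used earlier in this
tutorial:

[source,console]
--------------------------------------------------
# Check the state and checkpoint progress of the transform
GET _transform/ecommerce-customer-transform/_stats

# Stop the transform if it puts too much load on the cluster
POST _transform/ecommerce-customer-transform/_stop
--------------------------------------------------
// TEST[skip:setup kibana sample data]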
--

. Explore the data in your new index.
+
--
For example, use the *Discover* application in {kib}:

[role="screenshot"]
image::images/ecommerce-results.png["Exploring the new index in {kib}"]
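
You can also query the destination index directly. As a minimal sketch, assuming
the `ecommerce-customers` destination index and the `taxless_total_price.sum`
field created by the pivot {transform} in this tutorial, the following search
returns the five customers with the highest total spend:

[source,console]
--------------------------------------------------
# Assumes the destination index and field names used earlier in this tutorial
GET ecommerce-customers/_search
{
  "size": 5,
  "sort": [
    {
      "taxless_total_price.sum": {
        "order": "desc"
      }
    }
  ]
}
--------------------------------------------------
// TEST[skip:setup kibana sample data]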
--

. Optional: Create another {transform}, this time using the `latest` method.
+
--
This method populates the destination index with the latest documents for each
unique key value. For example, you might want to find the latest orders (sorted
by the `order_date` field) for each customer or for each country and region.

[role="screenshot"]
image::images/ecommerce-latest1.png["Creating a latest {transform} in {kib}"]

.API example
[%collapsible]
====
[source,console]
--------------------------------------------------
POST _transform/_preview
{
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "bool": {
        "filter": {
          "term": {"currency": "EUR"}
        }
      }
    }
  },
  "latest": {
    "unique_key": ["geoip.country_iso_code", "geoip.region_name"],
    "sort": "order_date"
  }
}
--------------------------------------------------
// TEST[skip:set up sample data]
====

TIP: If the destination index does not exist, it is created the first time you
start your {transform}. Unlike pivot {transforms}, however, latest {transforms}
do not deduce mapping definitions when they create the index. Instead, they use
dynamic mappings. To use explicit mappings, create the destination index
before you start the {transform}.
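
Once the preview looks right, creating the latest {transform} follows the same
pattern as the pivot {transform}. The following is a minimal sketch in which the
{transform} ID and the destination index name are hypothetical:

[source,console]
--------------------------------------------------
# Hypothetical transform ID and destination index name
PUT _transform/ecommerce-latest-by-region
{
  "source": {
    "index": "kibana_sample_data_ecommerce"
  },
  "latest": {
    "unique_key": ["geoip.country_iso_code", "geoip.region_name"],
    "sort": "order_date"
  },
  "dest": {
    "index": "ecommerce-latest-by-region"
  }
}
--------------------------------------------------
// TEST[skip:setup kibana sample data]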
--

. If you do not want to keep a {transform}, you can delete it in
{kib} or use the <<delete-transform,delete {transform} API>>. By default, when
you delete a {transform}, its destination index and {kib} index patterns remain.
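+
--
As a minimal sketch of the clean-up, assuming the {transform} ID and destination
index used earlier in this tutorial:

[source,console]
--------------------------------------------------
# A transform must be stopped before it can be deleted
POST _transform/ecommerce-customer-transform/_stop

DELETE _transform/ecommerce-customer-transform

# Optional: remove the destination index as well
DELETE ecommerce-customers
--------------------------------------------------
// TEST[skip:setup kibana sample data]
--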

Now that you've created simple {transforms} for {kib} sample data, consider
possible use cases for your own data. For more ideas, see
<<transform-usage>> and <<transform-examples>>.