123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262 |
- [role="xpack"]
- [testenv="basic"]
- [[ecommerce-transforms]]
- === Tutorial: Transforming the eCommerce sample data
- beta[]
- <<transforms,{transforms-cap}>> enable you to retrieve information
- from an {es} index, transform it, and store it in another index. Let's use the
- {kibana-ref}/add-sample-data.html[{kib} sample data] to demonstrate how you can
- pivot and summarize your data with {transforms}.
- . If the {es} {security-features} are enabled, obtain a user ID with sufficient
- privileges to complete these steps.
- +
- --
- You need `manage_data_frame_transforms` cluster privileges to preview and create
- {transforms}. Members of the built-in `data_frame_transforms_admin`
- role have these privileges.
- You also need `read` and `view_index_metadata` index privileges on the source
- index and `read`, `create_index`, and `index` privileges on the destination
- index.
- For more information, see
- {stack-ov}/security-privileges.html[Security privileges] and
- {stack-ov}/built-in-roles.html[Built-in roles].
- --
- . Choose your _source index_.
- +
- --
- In this example, we'll use the eCommerce orders sample data. If you're not
- already familiar with the `kibana_sample_data_ecommerce` index, use the
- *Revenue* dashboard in {kib} to explore the data. Consider what insights you
- might want to derive from this eCommerce data.
- --
- . Play with various options for grouping and aggregating the data.
- +
- --
- _Pivoting_ your data involves using at least one field to group it and applying
- at least one aggregation. You can preview what the transformed data will look
- like, so go ahead and play with it!
- For example, you might want to group the data by product ID and calculate the
- total number of sales for each product and its average price. Alternatively, you
- might want to look at the behavior of individual customers and calculate how
- much each customer spent in total and how many different categories of products
- they purchased. Or you might want to take the currencies or geographies into
- consideration. What are the most interesting ways you can transform and
- interpret this data?
- Go to *Machine Learning* > *Data Frames* in {kib} and use the
- wizard to create a {transform}:
- [role="screenshot"]
- image::images/ecommerce-pivot1.jpg["Creating a simple {transform} in {kib}"]
- In this case, we grouped the data by customer ID and calculated the sum of
- products each customer purchased.
- Let's add some more aggregations to learn more about our customers' orders. For
- example, let's calculate the total sum of their purchases, the maximum number of
- products that they purchased in a single order, and their total number of orders.
- We'll accomplish this by using the
- {ref}/search-aggregations-metrics-sum-aggregation.html[`sum` aggregation] on the
- `taxless_total_price` field, the
- {ref}/search-aggregations-metrics-max-aggregation.html[`max` aggregation] on the
- `total_quantity` field, and the
- {ref}/search-aggregations-metrics-cardinality-aggregation.html[`cardinality` aggregation]
- on the `order_id` field:
- [role="screenshot"]
- image::images/ecommerce-pivot2.jpg["Adding multiple aggregations to a {transform} in {kib}"]
- TIP: If you're interested in a subset of the data, you can optionally include a
- {ref}/search-request-body.html#request-body-search-query[query] element. In this
- example, we've filtered the data so that we're only looking at orders with a
- `currency` of `EUR`. Alternatively, we could group the data by that field too.
- If you want to use more complex queries, you can create your {dataframe} from a
- {kibana-ref}/save-open-search.html[saved search].
- If you prefer, you can use the
- {ref}/preview-transform.html[preview {transforms} API]:
- [source,console]
- --------------------------------------------------
- POST _data_frame/transforms/_preview
- {
- "source": {
- "index": "kibana_sample_data_ecommerce",
- "query": {
- "bool": {
- "filter": {
- "term": {"currency": "EUR"}
- }
- }
- }
- },
- "pivot": {
- "group_by": {
- "customer_id": {
- "terms": {
- "field": "customer_id"
- }
- }
- },
- "aggregations": {
- "total_quantity.sum": {
- "sum": {
- "field": "total_quantity"
- }
- },
- "taxless_total_price.sum": {
- "sum": {
- "field": "taxless_total_price"
- }
- },
- "total_quantity.max": {
- "max": {
- "field": "total_quantity"
- }
- },
- "order_id.cardinality": {
- "cardinality": {
- "field": "order_id"
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[skip:set up sample data]
- --
- . When you are satisfied with what you see in the preview, create the
- {transform}.
- +
- --
- .. Supply a job ID and the name of the target (or _destination_) index. If the
- target index does not exist, it will be created automatically.
- .. Decide whether you want the {transform} to run once or continuously.
- --
- +
- --
- Since this sample data index is unchanging, let's use the default behavior and
- just run the {transform} once.
- [role="screenshot"]
- image::images/ecommerce-batch.jpg["Specifying the {transform} options in {kib}"]
- If you want to try it out, however, go ahead and click on *Continuous mode*.
- You must choose a field that the {transform} can use to check which
- entities have changed. In general, it's a good idea to use the ingest timestamp
- field. In this example, however, you can use the `order_date` field.
- If you prefer, you can use the
- {ref}/put-transform.html[create {transforms} API]. For
- example:
- [source,console]
- --------------------------------------------------
- PUT _data_frame/transforms/ecommerce-customer-transform
- {
- "source": {
- "index": [
- "kibana_sample_data_ecommerce"
- ],
- "query": {
- "bool": {
- "filter": {
- "term": {
- "currency": "EUR"
- }
- }
- }
- }
- },
- "pivot": {
- "group_by": {
- "customer_id": {
- "terms": {
- "field": "customer_id"
- }
- }
- },
- "aggregations": {
- "total_quantity.sum": {
- "sum": {
- "field": "total_quantity"
- }
- },
- "taxless_total_price.sum": {
- "sum": {
- "field": "taxless_total_price"
- }
- },
- "total_quantity.max": {
- "max": {
- "field": "total_quantity"
- }
- },
- "order_id.cardinality": {
- "cardinality": {
- "field": "order_id"
- }
- }
- }
- },
- "dest": {
- "index": "ecommerce-customers"
- }
- }
- --------------------------------------------------
- // TEST[skip:setup kibana sample data]
- --
- . Start the {transform}.
- +
- --
- TIP: Even though resource utilization is automatically adjusted based on the
- cluster load, a {transform} increases search and indexing load on your
- cluster while it runs. If you're experiencing an excessive load, however, you
- can stop it.
- You can start, stop, and manage {transforms} in {kib}:
- [role="screenshot"]
- image::images/dataframe-transforms.jpg["Managing {transforms} in {kib}"]
- Alternatively, you can use the
- {ref}/start-transform.html[start {transforms}] and
- {ref}/stop-transform.html[stop {transforms}] APIs. For
- example:
- [source,console]
- --------------------------------------------------
- POST _data_frame/transforms/ecommerce-customer-transform/_start
- --------------------------------------------------
- // TEST[skip:setup kibana sample data]
- --
- . Explore the data in your new index.
- +
- --
- For example, use the *Discover* application in {kib}:
- [role="screenshot"]
- image::images/ecommerce-results.jpg["Exploring the new index in {kib}"]
- --
- TIP: If you do not want to keep the {transform}, you can delete it in
- {kib} or use the
- {ref}/delete-transform.html[delete {transform} API]. When
- you delete a {transform}, its destination index and {kib} index
- patterns remain.
|