|
@@ -24,7 +24,7 @@ might be returned if their support values are different.
|
|
|
|
|
|
The runtime of the aggregation depends on the data and the provided parameters.
|
|
|
It might take a significant time for the aggregation to complete. For this
|
|
|
-reason, it is recommended to use <<async-search, async search>> to run your
|
|
|
+reason, it is recommended to use <<async-search,async search>> to run your
|
|
|
requests asynchronously.
|
|
|
|
|
|
|
|
@@ -73,7 +73,7 @@ aggregation might require a significant amount of system resources.
|
|
|
The minimum set size is the minimum number of items the set needs to contain. A
|
|
|
value of 1 returns the frequency of single items. Only item sets that contain at
|
|
|
least the number of `minimum_set_size` items are returned. For example, the item
|
|
|
-set `orange, banana, apple` is only returned if the minimum set size is 3 or
|
|
|
+set `orange, banana, apple` is returned only if the minimum set size is 3 or
|
|
|
lower.
|
|
|
|
|
|
[discrete]
|
|
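Reviewer sketch for the `minimum_set_size` paragraph in the hunk above: a toy Python model of the size threshold it describes. The helper name `filter_by_minimum_set_size` and the candidate sets are hypothetical and only illustrate the rule. This is not how the aggregation is implemented.

```python
def filter_by_minimum_set_size(item_sets, minimum_set_size=1):
    """Keep only item sets that contain at least `minimum_set_size` items.

    Hypothetical helper that mirrors the documented rule only.
    """
    return [s for s in item_sets if len(s) >= minimum_set_size]


# Made-up candidate item sets for illustration.
candidates = [
    {"orange"},
    {"orange", "banana"},
    {"orange", "banana", "apple"},
]

# The three-item set survives a threshold of 3 or lower ...
kept_at_3 = filter_by_minimum_set_size(candidates, minimum_set_size=3)
# ... and is dropped once the threshold exceeds its size.
kept_at_4 = filter_by_minimum_set_size(candidates, minimum_set_size=4)
# A threshold of 1 also keeps single items.
kept_at_1 = filter_by_minimum_set_size(candidates, minimum_set_size=1)
```

With a threshold of 1 every candidate is kept, which matches the statement that a value of 1 returns the frequency of single items.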
@@ -123,7 +123,7 @@ In the following examples, we use the e-commerce {kib} sample data set.
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
-==== Aggregation with two analized fields
|
|
|
+==== Aggregation with two analyzed fields
|
|
|
|
|
|
In the first example, the goal is to find out based on transaction data (1.)
|
|
|
from what product categories the customers purchase products frequently together
|
|
@@ -131,7 +131,7 @@ and (2.) from which cities they make those purchases. We are interested in sets
|
|
|
with three or more items, and want to see the first three frequent item sets
|
|
|
with the highest support.
|
|
|
|
|
|
-Note that we use the <<async-search, async search>> endpoint in this first
|
|
|
+Note that we use the <<async-search,async search>> endpoint in this first
|
|
|
example.
|
|
|
|
|
|
[source,console]
|
|
@@ -228,8 +228,8 @@ of documents containing the item set by the total number of documents.
|
|
|
The response shows that the categories customers purchase from most frequently
|
|
|
together are `Women's Clothing` and `Women's Shoes` and customers from New York
|
|
|
tend to buy items from these categories frequently together. In other words,
|
|
|
-customers who buy products labelled Women's Clothing more likely buy products
|
|
|
-also from the Women's Shoes category and customers from New York most likely buy
|
|
|
+customers who buy products labelled `Women's Clothing` are more likely to also buy
|
|
|
+products from the `Women's Shoes` category, and customers from New York most likely buy
|
|
|
products from these categories together. The item set with the second highest
|
|
|
support is `Women's Clothing` and `Women's Accessories` with customers mostly
|
|
|
from New York. Finally, the item set with the third highest support is
|
|
@@ -269,8 +269,8 @@ POST /kibana_sample_data_ecommerce/_async_search
|
|
|
// TEST[skip:setup kibana sample data]
|
|
|
|
|
|
The result will only show item sets that are created from documents matching the
|
|
|
-filter, namely purchases in Europe. Using `filter` the calculated `support` still
|
|
|
-takes all purchases into acount. That's different to specifying a query at the
|
|
|
+filter, namely purchases in Europe. Using `filter`, the calculated `support` still
|
|
|
+takes all purchases into account. That's different than specifying a query at the
|
|
|
top-level, in which case `support` gets calculated only from purchases in Europe.
|
|
|
|
|
|
|
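Reviewer sketch for the `filter` paragraph in the hunk above: a toy Python model of the difference it describes between `filter` and a top-level query. It assumes that support is the number of documents containing the item set divided by the total number of documents considered. The `support` helper and the sample purchases are hypothetical and do not reflect the actual implementation.

```python
def support(docs, item_set, item_filter=None):
    """Fraction of `docs` that contain every item in `item_set`.

    If `item_filter` is given, only matching documents count as hits,
    but the denominator is still the full list of documents.
    """
    wanted = set(item_set)
    hits = sum(
        1
        for doc in docs
        if wanted <= set(doc["items"]) and (item_filter is None or item_filter(doc))
    )
    return hits / len(docs)


def in_europe(doc):
    return doc["continent"] == "Europe"


# Made-up purchases for illustration.
purchases = [
    {"continent": "Europe", "items": ["shoes", "clothing"]},
    {"continent": "Europe", "items": ["shoes"]},
    {"continent": "Asia", "items": ["shoes", "clothing"]},
    {"continent": "Asia", "items": ["accessories"]},
]

# `filter` behaviour: hits come from Europe only, denominator is all purchases.
with_filter = support(purchases, ["shoes", "clothing"], item_filter=in_europe)

# Top-level query behaviour: the documents are narrowed first, so the
# denominator shrinks to the European purchases.
europe_only = [doc for doc in purchases if in_europe(doc)]
with_query = support(europe_only, ["shoes", "clothing"])
```

In this toy data the same item set gets a support of 0.25 with the filter and 0.5 with the top-level query, because only the denominator differs.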
|
@@ -279,7 +279,7 @@ top-level, in which case `support` gets calculated only from purchases in Europe
|
|
|
|
|
|
The frequent items aggregation enables you to bucket numeric values by using
|
|
|
<<runtime,runtime fields>>. The next example demonstrates how to use a script to
|
|
|
-add a runtime field to your documents that called `price_range` which is
|
|
|
+add a runtime field to your documents called `price_range`, which is
|
|
|
calculated from the taxful total price of the individual transactions. The
|
|
|
runtime field then can be used in the frequent items aggregation as a field to
|
|
|
analyze.
|