|
|
@@ -1,16 +1,18 @@
|
|
|
[discrete]
|
|
|
[[esql-stats-by]]
|
|
|
-=== `STATS ... BY`
|
|
|
+=== `STATS`
|
|
|
|
|
|
-The `STATS ... BY` processing command groups rows according to a common value
|
|
|
+The `STATS` processing command groups rows according to a common value
|
|
|
and calculates one or more aggregated values over the grouped rows.
|
|
|
|
|
|
**Syntax**
|
|
|
|
|
|
[source,esql]
|
|
|
----
|
|
|
-STATS [column1 =] expression1[, ..., [columnN =] expressionN]
|
|
|
-[BY grouping_expression1[, ..., grouping_expressionN]]
|
|
|
+STATS [column1 =] expression1 [WHERE boolean_expression1][,
|
|
|
+ ...,
|
|
|
+ [columnN =] expressionN [WHERE boolean_expressionN]]
|
|
|
+ [BY grouping_expression1[, ..., grouping_expressionN]]
|
|
|
----
|
|
|
|
|
|
*Parameters*
|
|
|
@@ -28,14 +30,18 @@ An expression that computes an aggregated value.
|
|
|
An expression that outputs the values to group by.
|
|
|
If its name coincides with one of the computed columns, that column will be ignored.
|
|
|
|
|
|
+`boolean_expressionX`::
|
|
|
+The condition that must be met for a row to be included in the evaluation of `expressionX`.
|
|
|
+
|
|
|
NOTE: Individual `null` values are skipped when computing aggregations.
|
|
|
|
|
|
*Description*
|
|
|
|
|
|
-The `STATS ... BY` processing command groups rows according to a common value
|
|
|
-and calculate one or more aggregated values over the grouped rows. If `BY` is
|
|
|
-omitted, the output table contains exactly one row with the aggregations applied
|
|
|
-over the entire dataset.
|
|
|
+The `STATS` processing command groups rows according to a common value
|
|
|
+and calculates one or more aggregated values over the grouped rows. For the
|
|
|
+calculation of each aggregated value, the rows in a group can be filtered with
|
|
|
+`WHERE`. If `BY` is omitted, the output table contains exactly one row with
|
|
|
+the aggregations applied over the entire dataset.
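
As a minimal sketch of the per-aggregation filtering described above, the query below mixes one unfiltered and two filtered aggregations in a single `STATS` command. The `employees` index and its `salary`, `birth_date`, and `gender` fields are assumptions made for illustration, as are the output column names.

[source,esql]
----
// Hypothetical index and field names, used only for illustration.
// avg_salary sees every row in the group; the other two columns only
// see the rows that satisfy their own WHERE condition.
FROM employees
| STATS avg_salary = AVG(salary),
        avg_salary_pre_1960 = AVG(salary) WHERE birth_date < "1960-01-01",
        born_1960_or_later = COUNT(*) WHERE birth_date >= "1960-01-01"
  BY gender
----

Each `WHERE` applies only to the aggregation it follows, so the three columns can be computed over different subsets of the same group. Removing the `BY gender` line should yield a single row computed over the whole dataset.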
|
|
|
|
|
|
The following <<esql-agg-functions,aggregation functions>> are supported:
|
|
|
|
|
|
@@ -90,6 +96,29 @@ include::{esql-specs}/stats.csv-spec[tag=statsCalcMultipleValues]
|
|
|
include::{esql-specs}/stats.csv-spec[tag=statsCalcMultipleValues-result]
|
|
|
|===
|
|
|
|
|
|
+To filter the rows that go into an aggregation, use the `WHERE` clause:
|
|
|
+
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+include::{esql-specs}/stats.csv-spec[tag=aggFiltering]
|
|
|
+----
|
|
|
+[%header.monospaced.styled,format=dsv,separator=|]
|
|
|
+|===
|
|
|
+include::{esql-specs}/stats.csv-spec[tag=aggFiltering-result]
|
|
|
+|===
|
|
|
+
|
|
|
+Aggregations with and without a filter can be combined in the same command, and
|
|
|
+grouping is optional as well:
|
|
|
+
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+include::{esql-specs}/stats.csv-spec[tag=aggFilteringNoGroup]
|
|
|
+----
|
|
|
+[%header.monospaced.styled,format=dsv,separator=|]
|
|
|
+|===
|
|
|
+include::{esql-specs}/stats.csv-spec[tag=aggFilteringNoGroup-result]
|
|
|
+|===
|
|
|
+
|
|
|
[[esql-stats-mv-group]]
|
|
|
If the grouping key is multivalued then the input row is in all groups:
|
|
|
|
|
|
@@ -109,7 +138,7 @@ It's also possible to group by multiple values:
|
|
|
include::{esql-specs}/stats.csv-spec[tag=statsGroupByMultipleValues]
|
|
|
----
|
|
|
|
|
|
-If the all grouping keys are multivalued then the input row is in all groups:
|
|
|
+If all the grouping keys are multivalued, then the input row is in all groups:
|
|
|
|
|
|
[source.merge.styled,esql]
|
|
|
----
|
|
|
@@ -121,7 +150,7 @@ include::{esql-specs}/stats.csv-spec[tag=multi-mv-group-result]
|
|
|
|===
|
|
|
|
|
|
Both the aggregating functions and the grouping expressions accept other
|
|
|
-functions. This is useful for using `STATS...BY` on multivalue columns.
|
|
|
+functions. This is useful for using `STATS` on multivalue columns.
|
|
|
For example, to calculate the average salary change, you can use `MV_AVG` to
|
|
|
first average the multiple values per employee, and use the result with the
|
|
|
`AVG` function:
|