```yaml {applies_to} serverless: ga stack: ga ``` The `STATS` processing command groups rows according to a common value and calculates one or more aggregated values over the grouped rows. **Syntax** ```esql STATS [column1 =] expression1 [WHERE boolean_expression1][, ..., [columnN =] expressionN [WHERE boolean_expressionN]] [BY grouping_expression1[, ..., grouping_expressionN]] ``` **Parameters** `columnX` : The name by which the aggregated value is returned. If omitted, the name is equal to the corresponding expression (`expressionX`). If multiple columns have the same name, all but the rightmost column with this name will be ignored. `expressionX` : An expression that computes an aggregated value. `grouping_expressionX` : An expression that outputs the values to group by. If its name coincides with one of the computed columns, that column will be ignored. `boolean_expressionX` : The condition that must be met for a row to be included in the evaluation of `expressionX`. ::::{note} Individual `null` values are skipped when computing aggregations. :::: **Description** The `STATS` processing command groups rows according to a common value and calculates one or more aggregated values over the grouped rows. For the calculation of each aggregated value, the rows in a group can be filtered with `WHERE`. If `BY` is omitted, the output table contains exactly one row with the aggregations applied over the entire dataset. The following [aggregation functions](/reference/query-languages/esql/functions-operators/aggregation-functions.md) are supported: :::{include} ../../lists/aggregation-functions.md ::: The following [grouping functions](/reference/query-languages/esql/functions-operators/grouping-functions.md) are supported: :::{include} ../../lists/grouping-functions.md ::: ::::{note} `STATS` without any groups is much much faster than adding a group. :::: ::::{note} Grouping on a single expression is currently much more optimized than grouping on many expressions. In some tests we have seen grouping on a single `keyword` column to be five times faster than grouping on two `keyword` columns. Do not try to work around this by combining the two columns together with something like [`CONCAT`](/reference/query-languages/esql/functions-operators/string-functions.md#esql-concat) and then grouping - that is not going to be faster. :::: **Examples** Calculating a statistic and grouping by the values of another column: :::{include} ../examples/stats.csv-spec/stats.md ::: Omitting `BY` returns one row with the aggregations applied over the entire dataset: :::{include} ../examples/stats.csv-spec/statsWithoutBy.md ::: It’s possible to calculate multiple values: :::{include} ../examples/stats.csv-spec/statsCalcMultipleValues.md ::: To filter the rows that go into an aggregation, use the `WHERE` clause: :::{include} ../examples/stats.csv-spec/aggFiltering.md ::: The aggregations can be mixed, with and without a filter and grouping is optional as well: :::{include} ../examples/stats.csv-spec/aggFilteringNoGroup.md ::: $$$esql-stats-mv-group$$$ If the grouping key is multivalued then the input row is in all groups: :::{include} ../examples/stats.csv-spec/mv-group.md ::: It’s also possible to group by multiple values: :::{include} ../examples/stats.csv-spec/statsGroupByMultipleValues.md ::: If all the grouping keys are multivalued then the input row is in all groups: :::{include} ../examples/stats.csv-spec/multi-mv-group.md ::: Both the aggregating functions and the grouping expressions accept other functions. This is useful for using `STATS` on multivalue columns. For example, to calculate the average salary change, you can use `MV_AVG` to first average the multiple values per employee, and use the result with the `AVG` function: :::{include} ../examples/stats.csv-spec/docsStatsAvgNestedExpression.md ::: An example of grouping by an expression is grouping employees on the first letter of their last name: :::{include} ../examples/stats.csv-spec/docsStatsByExpression.md ::: Specifying the output column name is optional. If not specified, the new column name is equal to the expression. The following query returns a column named `AVG(salary)`: :::{include} ../examples/stats.csv-spec/statsUnnamedColumn.md ::: Because this name contains special characters, [it needs to be quoted](/reference/query-languages/esql/esql-syntax.md#esql-identifiers) with backticks (```) when using it in subsequent commands: :::{include} ../examples/stats.csv-spec/statsUnnamedColumnEval.md :::