[discrete]
[[esql-agg-count-distinct]]
=== `COUNT_DISTINCT`

*Syntax*

[source,esql]
----
COUNT_DISTINCT(column[, precision])
----

*Parameters*

`column`::
Column for which to count the number of distinct values.

`precision`::
Optional precision of the approximate count. Refer to
<<esql-agg-count-distinct-approximate>>.

*Description*

Returns the approximate number of distinct values.

[discrete]
[[esql-agg-count-distinct-approximate]]
==== Counts are approximate

Computing exact counts requires loading values into a set and returning its
size. This doesn't scale to high-cardinality sets or large values: the memory
required, and the need to send those per-shard sets between nodes, would
consume too many cluster resources.

This `COUNT_DISTINCT` function is based on the
https://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/40671.pdf[HyperLogLog++]
algorithm, which counts based on the hashes of the values with some interesting
properties:

include::../../aggregations/metrics/cardinality-aggregation.asciidoc[tag=explanation]

The `COUNT_DISTINCT` function takes an optional second parameter to configure the
precision.

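The trade-off described in this section can be illustrated with a minimal Python sketch. It is not the Elasticsearch implementation: it uses plain HyperLogLog rather than HyperLogLog++, and the names `HyperLogLogSketch`, `exact_count`, and `approximate_count`, the SHA-1 hash, and the register-count exponent `p` are all assumptions made for illustration. In particular, `p` is not the same thing as the `precision` parameter of `COUNT_DISTINCT`. The sketch only shows the general idea: an exact count has to merge full per-shard sets, while a hash-based sketch merges small fixed-size register arrays.

```python
import hashlib
import math
from typing import Iterable


def exact_count(shards: Iterable[Iterable[str]]) -> int:
    """Exact distinct count: every shard's full value set must be merged."""
    merged = set()
    for shard in shards:
        merged |= set(shard)
    return len(merged)


class HyperLogLogSketch:
    """Plain HyperLogLog (illustrative only, not HLL++): 2**p small registers."""

    def __init__(self, p: int = 12) -> None:
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        # Only a 64-bit hash of the value is inspected, never the value itself.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)              # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLogSketch") -> None:
        # Merging two sketches is a register-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)        # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small cardinalities: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


def approximate_count(shards: Iterable[Iterable[str]], p: int = 12) -> float:
    """Build one sketch per shard, then merge the fixed-size sketches."""
    merged = HyperLogLogSketch(p)
    for shard in shards:
        sketch = HyperLogLogSketch(p)
        for value in shard:
            sketch.add(value)
        merged.merge(sketch)
    return merged.estimate()
```

With `p = 12`, each sketch holds 4096 registers regardless of how many values were added, so the data exchanged between shards stays constant in size. The cost is that the result is an estimate, not an exact count.
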
*Supported types*

Can take any field type as input.

*Examples*

[source.merge.styled,esql]
----
include::{esql-specs}/stats_count_distinct.csv-spec[tag=count-distinct]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/stats_count_distinct.csv-spec[tag=count-distinct-result]
|===

With the optional second parameter to configure the precision:

[source.merge.styled,esql]
----
include::{esql-specs}/stats_count_distinct.csv-spec[tag=count-distinct-precision]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/stats_count_distinct.csv-spec[tag=count-distinct-precision-result]
|===