auto_bucket.asciidoc 3.3 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120
  1. [discrete]
  2. [[esql-auto_bucket]]
  3. === `AUTO_BUCKET`
  4. experimental::[]
  5. *Syntax*
  6. [source,esql]
  7. ----
  8. AUTO_BUCKET(expression, buckets, from, to)
  9. ----
  10. *Parameters*
  11. `field`::
  12. Numeric or date expression from which to derive buckets.
  13. `buckets`::
  14. Target number of buckets.
  15. `from`::
  16. Start of the range. Can be a number or a date expressed as a string.
  17. `to`::
  18. End of the range. Can be a number or a date expressed as a string.
  19. *Description*
  20. Creates human-friendly buckets and returns a value for each row that corresponds
  21. to the resulting bucket the row falls into.
  22. Using a target number of buckets, a start of a range, and an end of a range,
  23. `AUTO_BUCKET` picks an appropriate bucket size to generate the target number of
  24. buckets or fewer. For example, asking for at most 20 buckets over a year results
  25. in monthly buckets:
  26. [source.merge.styled,esql]
  27. ----
  28. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketMonth]
  29. ----
  30. [%header.monospaced.styled,format=dsv,separator=|]
  31. |===
  32. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketMonth-result]
  33. |===
  34. The goal isn't to provide *exactly* the target number of buckets, it's to pick a
  35. range that people are comfortable with that provides at most the target number
  36. of buckets.
  37. Combine `AUTO_BUCKET` with
  38. <<esql-stats-by>> to create a histogram:
  39. [source.merge.styled,esql]
  40. ----
  41. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketMonthlyHistogram]
  42. ----
  43. [%header.monospaced.styled,format=dsv,separator=|]
  44. |===
  45. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketMonthlyHistogram-result]
  46. |===
  47. NOTE: `AUTO_BUCKET` does not create buckets that don't match any documents.
  48. That's why this example is missing `1985-03-01` and other dates.
  49. Asking for more buckets can result in a smaller range. For example, asking for
  50. at most 100 buckets in a year results in weekly buckets:
  51. [source.merge.styled,esql]
  52. ----
  53. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketWeeklyHistogram]
  54. ----
  55. [%header.monospaced.styled,format=dsv,separator=|]
  56. |===
  57. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketWeeklyHistogram-result]
  58. |===
  59. NOTE: `AUTO_BUCKET` does not filter any rows. It only uses the provided range to
  60. pick a good bucket size. For rows with a value outside of the range, it returns
  61. a bucket value that corresponds to a bucket outside the range. Combine
  62. `AUTO_BUCKET` with <<esql-where>> to filter rows.
  63. `AUTO_BUCKET` can also operate on numeric fields. For example, to create a
  64. salary histogram:
  65. [source.merge.styled,esql]
  66. ----
  67. include::{esql-specs}/ints.csv-spec[tag=docsAutoBucketNumeric]
  68. ----
  69. [%header.monospaced.styled,format=dsv,separator=|]
  70. |===
  71. include::{esql-specs}/ints.csv-spec[tag=docsAutoBucketNumeric-result]
  72. |===
  73. Unlike the earlier example that intentionally filters on a date range, you
  74. rarely want to filter on a numeric range. You have to find the `min` and `max`
  75. separately. {esql} doesn't yet have an easy way to do that automatically.
  76. *Examples*
  77. Create hourly buckets for the last 24 hours, and calculate the number of events
  78. per hour:
  79. [source.styled,esql]
  80. ----
  81. include::{esql-specs}/date.csv-spec[tag=docsAutoBucketLast24hr]
  82. ----
  83. Create monthly buckets for the year 1985, and calculate the average salary by
  84. hiring month:
  85. [source.merge.styled,esql]
  86. ----
  87. include::{esql-specs}/date.csv-spec[tag=auto_bucket_in_agg]
  88. ----
  89. [%header.monospaced.styled,format=dsv,separator=|]
  90. |===
  91. include::{esql-specs}/date.csv-spec[tag=auto_bucket_in_agg-result]
  92. |===