
[[esql-getting-started]]
== Getting started with {esql} queries

++++
<titleabbrev>Getting started</titleabbrev>
++++

This guide shows how you can use {esql} to query and aggregate your data.

[discrete]
[[esql-getting-started-prerequisites]]
=== Prerequisites

To follow along with the queries in this guide, you can either set up your own
deployment, or use Elastic's public {esql} demo environment.

include::{es-repo-dir}/tab-widgets/esql/esql-getting-started-widget-sample-data.asciidoc[]

[discrete]
[[esql-getting-started-running-queries]]
=== Run an {esql} query

In {kib}, you can use Console or Discover to run {esql} queries:

include::{es-repo-dir}/tab-widgets/esql/esql-getting-started-widget-discover-console.asciidoc[]

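
Outside {kib}, you can also submit a query over the REST API. The following is
a sketch only: the endpoint and `format` parameter shown here are assumptions,
so check the {esql} API reference for your version before relying on them.

[source,console]
----
POST /_query?format=txt
{
  "query": "FROM sample_data | LIMIT 3"
}
----
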
[discrete]
[[esql-getting-started-first-query]]
=== Your first {esql} query

Each {esql} query starts with a <<esql-source-commands,source command>>. A
source command produces a table, typically with data from {es}.

image::images/esql/source-command.svg[A source command producing a table from {es},align="center"]

The <<esql-from>> source command returns a table with documents from a data
stream, index, or alias. Each row in the resulting table represents a document.
This query returns up to 500 documents from the `sample_data` index:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-from]
----

Each column corresponds to a field, and can be accessed by the name of that
field.

[TIP]
====
{esql} keywords are case-insensitive. The following query is identical to the
previous one:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-from-lowercase]
----
====

[discrete]
[[esql-getting-started-limit]]
=== Processing commands

A source command can be followed by one or more
<<esql-processing-commands,processing commands>>, separated by a pipe character:
`|`. Processing commands change an input table by adding, removing, or changing
rows and columns. Processing commands can perform filtering, projection,
aggregation, and more.

image::images/esql/esql-limit.png[A processing command changing an input table,align="center",width="60%"]

For example, you can use the <<esql-limit>> command to limit the number of rows
that are returned, up to a maximum of 10,000 rows:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-limit]
----

[TIP]
====
For readability, you can put each command on a separate line. However, you don't
have to. The following query is identical to the previous one:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-limit-one-line]
----
====

[discrete]
[[esql-getting-started-sort]]
==== Sort a table

image::images/esql/esql-sort.png[A processing command sorting an input table,align="center",width="60%"]

Another processing command is the <<esql-sort>> command. By default, the rows
returned by `FROM` don't have a defined sort order. Use the `SORT` command to
sort rows on one or more columns:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-sort]
----

[discrete]
[[esql-getting-started-where]]
==== Query the data

Use the <<esql-where>> command to query the data. For example, to find all
events with a duration longer than 5ms:

[source,esql]
----
include::{esql-specs}/where.csv-spec[tag=gs-where]
----

`WHERE` supports several <<esql-operators,operators>>. For example, you can use
<<esql-like-operator>> to run a wildcard query against the `message` column:

[source,esql]
----
include::{esql-specs}/where-like.csv-spec[tag=gs-like]
----

[discrete]
[[esql-getting-started-more-commands]]
==== More processing commands

There are many other processing commands, like <<esql-keep>> and <<esql-drop>>
to keep or drop columns, <<esql-enrich>> to enrich a table with data from
indices in {es}, and <<esql-dissect>> and <<esql-grok>> to process data. Refer
to <<esql-processing-commands>> for an overview of all processing commands.

[discrete]
[[esql-getting-started-chaining]]
=== Chain processing commands

You can chain processing commands, separated by a pipe character: `|`. Each
processing command works on the output table of the previous command. The result
of a query is the table produced by the final processing command.

image::images/esql/esql-sort-limit.png[Processing commands can be chained,align="center"]

The following example first sorts the table on `@timestamp`, and next limits the
result set to 3 rows:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-chaining]
----


NOTE: The order of processing commands is important. Limiting the result set to
3 rows before sorting those rows would most likely return a different result
than this example, where the sort comes before the limit.

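
To illustrate, a hypothetical variation of the query above that swaps the two
commands would first pick 3 rows in an undefined order, and only then sort
those 3 rows:

[source,esql]
----
FROM sample_data
| LIMIT 3
| SORT @timestamp DESC
----
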
[discrete]
[[esql-getting-started-eval]]
=== Compute values

Use the <<esql-eval>> command to append columns to a table, with calculated
values. For example, the following query appends a `duration_ms` column. The
values in the column are computed by dividing `event_duration` by 1,000,000. In
other words: `event_duration` converted from nanoseconds to milliseconds.

[source,esql]
----
include::{esql-specs}/eval.csv-spec[tag=gs-eval]
----

`EVAL` supports several <<esql-functions,functions>>. For example, to round a
number to a specified number of digits, use the <<esql-round>> function:

[source,esql]
----
include::{esql-specs}/eval.csv-spec[tag=gs-round]
----

[discrete]
[[esql-getting-started-stats]]
=== Calculate statistics

You can use {esql} not only to query your data, but also to aggregate it. Use
the <<esql-stats-by>> command to calculate statistics. For example, the median
duration:

[source,esql]
----
include::{esql-specs}/stats.csv-spec[tag=gs-stats]
----

You can calculate multiple stats with one command:

[source,esql]
----
include::{esql-specs}/stats.csv-spec[tag=gs-two-stats]
----

Use `BY` to group calculated stats by one or more columns. For example, to
calculate the median duration per client IP:

[source,esql]
----
include::{esql-specs}/stats.csv-spec[tag=gs-stats-by]
----

[discrete]
[[esql-getting-started-histogram]]
=== Create a histogram

To track statistics over time, {esql} enables you to create histograms using the
<<esql-auto_bucket>> function. `AUTO_BUCKET` creates human-friendly bucket sizes
and returns, for each row, a value that corresponds to the bucket the row falls
into.

For example, to create hourly buckets for the data on October 23rd:

[source,esql]
----
include::{esql-specs}/date.csv-spec[tag=gs-auto_bucket]
----

Combine `AUTO_BUCKET` with <<esql-stats-by>> to create a histogram. For example,
to count the number of events per hour:

[source,esql]
----
include::{esql-specs}/date.csv-spec[tag=gs-auto_bucket-stats-by]
----

Or the median duration per hour:

[source,esql]
----
include::{esql-specs}/date.csv-spec[tag=gs-auto_bucket-stats-by-median]
----

[discrete]
[[esql-getting-started-enrich]]
=== Enrich data

{esql} enables you to <<esql-enrich-data,enrich>> a table with data from indices
in {es}, using the <<esql-enrich>> command.

image::images/esql/esql-enrich.png[The ENRICH command enriching an input table,align="center"]

Before you can use `ENRICH`, you first need to
<<esql-create-enrich-policy,create>> and <<esql-execute-enrich-policy,execute>>
an <<esql-enrich-policy,enrich policy>>.

include::{es-repo-dir}/tab-widgets/esql/esql-getting-started-widget-enrich-policy.asciidoc[]

After creating and executing a policy, you can use it with the `ENRICH`
command:

[source,esql]
----
include::{esql-specs}/enrich.csv-spec[tag=gs-enrich]
----

You can use the new `env` column that's added by the `ENRICH` command in
subsequent commands. For example, to calculate the median duration per
environment:

[source,esql]
----
include::{esql-specs}/enrich.csv-spec[tag=gs-enrich-stats-by]
----

For more about data enrichment with {esql}, refer to <<esql-enrich-data>>.

[discrete]
[[esql-getting-started-process-data]]
=== Process data

Your data may contain unstructured strings that you want to
<<esql-process-data-with-dissect-and-grok,structure>> to make it easier to
analyze the data. For example, the sample data contains log messages like:

[source,txt]
----
"Connected to 10.1.0.3"
----

By extracting the IP address from these messages, you can determine which IP has
accepted the most client connections.

To structure unstructured strings at query time, you can use the {esql}
<<esql-dissect>> and <<esql-grok>> commands. `DISSECT` works by breaking up a
string using a delimiter-based pattern. `GROK` works similarly, but uses regular
expressions. This makes `GROK` more powerful, but generally also slower.

In this case, no regular expressions are needed, as the `message` is
straightforward: "Connected to ", followed by the server IP. To match this
string, you can use the following `DISSECT` command:

[source,esql]
----
include::{esql-specs}/dissect.csv-spec[tag=gs-dissect]
----

This adds a `server_ip` column to those rows that have a `message` that matches
this pattern. For other rows, the value of `server_ip` is `null`.

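
For comparison, a `GROK` command could match the same messages using the
built-in `IP` pattern. This is a sketch only, assuming the message format shown
above:

[source,esql]
----
FROM sample_data
| GROK message "Connected to %{IP:server_ip}"
----
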
You can use the new `server_ip` column that's added by the `DISSECT` command in
subsequent commands. For example, to determine how many connections each server
has accepted:

[source,esql]
----
include::{esql-specs}/dissect.csv-spec[tag=gs-dissect-stats-by]
----

For more about data processing with {esql}, refer to
<<esql-process-data-with-dissect-and-grok>>.

[discrete]
[[esql-getting-learn-more]]
=== Learn more

To learn more about {esql}, refer to <<esql-language>> and <<esql-using>>.