
[[esql-getting-started]]
== Getting started with {esql} queries
++++
<titleabbrev>Getting started</titleabbrev>
++++

preview::["Do not use {esql} on production environments. This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features."]

This guide shows how you can use {esql} to query and aggregate your data.

[discrete]
[[esql-getting-started-prerequisites]]
=== Prerequisites

To follow along with the queries in this guide, you can either set up your own
deployment, or use Elastic's public {esql} demo environment.

include::{es-repo-dir}/tab-widgets/esql/esql-getting-started-widget-sample-data.asciidoc[]

[discrete]
[[esql-getting-started-running-queries]]
=== Run an {esql} query

In {kib}, you can use Console or Discover to run {esql} queries:

include::{es-repo-dir}/tab-widgets/esql/esql-getting-started-widget-discover-console.asciidoc[]

[discrete]
[[esql-getting-started-first-query]]
=== Your first {esql} query

Each {esql} query starts with a <<esql-source-commands,source command>>. A
source command produces a table, typically with data from {es}.

image::images/esql/source-command.svg[A source command producing a table from {es},align="center"]

The <<esql-from>> source command returns a table with documents from a data
stream, index, or alias. Each row in the resulting table represents a document.
This query returns up to 500 documents from the `sample_data` index:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-from]
----
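Written out, a query of this shape is simply the source command followed by the
index name:

[source,esql]
----
FROM sample_data
----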
Each column corresponds to a field, and can be accessed by the name of that
field.

[TIP]
====
{esql} keywords are case-insensitive. The following query is identical to the
previous one:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-from-lowercase]
----
====

[discrete]
[[esql-getting-started-limit]]
=== Processing commands

A source command can be followed by one or more
<<esql-processing-commands,processing commands>>, separated by a pipe character:
`|`. Processing commands change an input table by adding, removing, or changing
rows and columns. Processing commands can perform filtering, projection,
aggregation, and more.

image::images/esql/esql-limit.png[A processing command changing an input table,align="center",width="60%"]

For example, you can use the <<esql-limit>> command to limit the number of rows
that are returned, up to a maximum of 10,000 rows:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-limit]
----
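A sketch of such a query, limiting the result to three rows (the row count is
an illustrative choice):

[source,esql]
----
FROM sample_data
| LIMIT 3
----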
[TIP]
====
For readability, you can put each command on a separate line. However, you don't
have to. The following query is identical to the previous one:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-limit-one-line]
----
====

[discrete]
[[esql-getting-started-sort]]
==== Sort a table

image::images/esql/esql-sort.png[A processing command sorting an input table,align="center",width="60%"]

Another processing command is the <<esql-sort>> command. By default, the rows
returned by `FROM` don't have a defined sort order. Use the `SORT` command to
sort rows on one or more columns:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-sort]
----
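For example, a sketch of sorting on the `@timestamp` column in descending
order (the column name is taken from this guide's other examples):

[source,esql]
----
FROM sample_data
| SORT @timestamp DESC
----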
[discrete]
[[esql-getting-started-where]]
==== Query the data

Use the <<esql-where>> command to query the data. For example, to find all
events with a duration longer than 5ms:

[source,esql]
----
include::{esql-specs}/where.csv-spec[tag=gs-where]
----
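Because `event_duration` holds nanoseconds, 5ms corresponds to 5,000,000
nanoseconds. A sketch of such a query:

[source,esql]
----
FROM sample_data
| WHERE event_duration > 5000000
----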
`WHERE` supports several <<esql-operators,operators>>. For example, you can use
<<esql-like-operator>> to run a wildcard query against the `message` column:

[source,esql]
----
include::{esql-specs}/where-like.csv-spec[tag=gs-like]
----
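A sketch of a wildcard query, matching the "Connected to …" messages in the
sample data (the `*` wildcard matches any number of characters):

[source,esql]
----
FROM sample_data
| WHERE message LIKE "Connected*"
----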
[discrete]
[[esql-getting-started-more-commands]]
==== More processing commands

There are many other processing commands, like <<esql-keep>> and <<esql-drop>>
to keep or drop columns, <<esql-enrich>> to enrich a table with data from
indices in {es}, and <<esql-dissect>> and <<esql-grok>> to process data. Refer
to <<esql-processing-commands>> for an overview of all processing commands.

[discrete]
[[esql-getting-started-chaining]]
=== Chain processing commands

You can chain processing commands, separated by a pipe character: `|`. Each
processing command works on the output table of the previous command. The result
of a query is the table produced by the final processing command.

image::images/esql/esql-sort-limit.png[Processing commands can be chained,align="center"]

The following example first sorts the table on `@timestamp`, and next limits the
result set to 3 rows:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=gs-chaining]
----
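Written out, sorting descending on `@timestamp` and then keeping the first
three rows looks like this:

[source,esql]
----
FROM sample_data
| SORT @timestamp DESC
| LIMIT 3
----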
NOTE: The order of processing commands matters. Limiting the result set to 3
rows first and then sorting those 3 rows would most likely return a result that
is different from this example, where the sort comes before the limit.

[discrete]
[[esql-getting-started-eval]]
=== Compute values

Use the <<esql-eval>> command to append columns to a table, with calculated
values. For example, the following query appends a `duration_ms` column. The
values in the column are computed by dividing `event_duration` by 1,000,000. In
other words: `event_duration` converted from nanoseconds to milliseconds.

[source,esql]
----
include::{esql-specs}/eval.csv-spec[tag=gs-eval]
----
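Such a query could look like this:

[source,esql]
----
FROM sample_data
| EVAL duration_ms = event_duration/1000000.0
----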
`EVAL` supports several <<esql-functions,functions>>. For example, to round a
number to the closest number with the specified number of decimal digits, use
the <<esql-round>> function:

[source,esql]
----
include::{esql-specs}/eval.csv-spec[tag=gs-round]
----
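For example, rounding the computed milliseconds to one decimal digit (the
number of digits is an illustrative choice):

[source,esql]
----
FROM sample_data
| EVAL duration_ms = ROUND(event_duration/1000000.0, 1)
----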
[discrete]
[[esql-getting-started-stats]]
=== Calculate statistics

You can use {esql} not only to query your data, but also to aggregate it. Use
the <<esql-stats-by>> command to calculate statistics. For example, the median
duration:

[source,esql]
----
include::{esql-specs}/stats.csv-spec[tag=gs-stats]
----
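A sketch of such an aggregation, using the `MEDIAN` function (the result
column name is an illustrative choice):

[source,esql]
----
FROM sample_data
| STATS median_duration = MEDIAN(event_duration)
----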
You can calculate multiple stats with one command:

[source,esql]
----
include::{esql-specs}/stats.csv-spec[tag=gs-two-stats]
----

Use `BY` to group calculated stats by one or more columns. For example, to
calculate the median duration per client IP:

[source,esql]
----
include::{esql-specs}/stats.csv-spec[tag=gs-stats-by]
----
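Grouping on the client IP column could look like this (the `client_ip` field
name and the result column name are illustrative choices):

[source,esql]
----
FROM sample_data
| STATS median_duration = MEDIAN(event_duration) BY client_ip
----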
[discrete]
[[esql-getting-started-access-columns]]
=== Access columns

You can access columns by their name. If a name contains special characters,
<<esql-identifiers,it needs to be quoted>> with backticks (+{backtick}+).

Assigning an explicit name to a column created by `EVAL` or `STATS` is optional.
If you don't provide a name, the new column name is equal to the function
expression. For example:

[source,esql]
----
include::{esql-specs}/eval.csv-spec[tag=gs-eval-no-column-name]
----

In this query, `EVAL` adds a new column named `event_duration/1000000.0`.
Because its name contains special characters, to access this column, quote it
with backticks:

[source,esql]
----
include::{esql-specs}/eval.csv-spec[tag=gs-eval-stats-backticks]
----
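Put together, an unnamed `EVAL` column and a backtick-quoted reference to it
could look like this:

[source,esql]
----
FROM sample_data
| EVAL event_duration/1000000.0
| STATS MEDIAN(`event_duration/1000000.0`)
----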
[discrete]
[[esql-getting-started-histogram]]
=== Create a histogram

To track statistics over time, {esql} enables you to create histograms using the
<<esql-auto_bucket>> function. `AUTO_BUCKET` creates human-friendly bucket sizes
and returns a value for each row that corresponds to the resulting bucket the
row falls into.

For example, to create hourly buckets for the data on October 23rd:

[source,esql]
----
include::{esql-specs}/date.csv-spec[tag=gs-auto_bucket]
----

Combine `AUTO_BUCKET` with <<esql-stats-by>> to create a histogram. For example,
to count the number of events per hour:

[source,esql]
----
include::{esql-specs}/date.csv-spec[tag=gs-auto_bucket-stats-by]
----
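A sketch of counting events per hourly bucket. The target bucket count and the
exact date range are assumptions based on the October 23rd example above:

[source,esql]
----
FROM sample_data
| EVAL bucket = AUTO_BUCKET(@timestamp, 24, "2023-10-23T00:00:00Z", "2023-10-23T23:59:59Z")
| STATS COUNT(*) BY bucket
----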
Or the median duration per hour:

[source,esql]
----
include::{esql-specs}/date.csv-spec[tag=gs-auto_bucket-stats-by-median]
----

[discrete]
[[esql-getting-started-enrich]]
=== Enrich data

{esql} enables you to <<esql-enrich-data,enrich>> a table with data from indices
in {es}, using the <<esql-enrich>> command.

image::images/esql/esql-enrich.png[align="center"]

Before you can use `ENRICH`, you first need to
<<esql-create-enrich-policy,create>> and <<esql-execute-enrich-policy,execute>>
an <<esql-enrich-policy,enrich policy>>.

include::{es-repo-dir}/tab-widgets/esql/esql-getting-started-widget-enrich-policy.asciidoc[]

After creating and executing a policy, you can use it with the `ENRICH`
command:

[source,esql]
----
include::{esql-specs}/enrich.csv-spec[tag=gs-enrich]
----
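A sketch of such a query. The policy name `clientip_policy` and the
`client_ip` match field are assumptions; `env` is the enrich field added to
the table:

[source,esql]
----
FROM sample_data
| ENRICH clientip_policy ON client_ip WITH env
----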
You can use the new `env` column that's added by the `ENRICH` command in
subsequent commands. For example, to calculate the median duration per
environment:

[source,esql]
----
include::{esql-specs}/enrich.csv-spec[tag=gs-enrich-stats-by]
----
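A sketch of such a pipeline (the policy name `clientip_policy`, the
`client_ip` match field, and the result column name are assumptions):

[source,esql]
----
FROM sample_data
| ENRICH clientip_policy ON client_ip WITH env
| STATS median_duration = MEDIAN(event_duration) BY env
----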
For more about data enrichment with {esql}, refer to <<esql-enrich-data>>.

[discrete]
[[esql-getting-started-process-data]]
=== Process data

Your data may contain unstructured strings that you want to
<<esql-process-data-with-dissect-and-grok,structure>> to make it easier to
analyze the data. For example, the sample data contains log messages like:

[source,txt]
----
"Connected to 10.1.0.3"
----

By extracting the IP address from these messages, you can determine which IP has
accepted the most client connections.

To structure unstructured strings at query time, you can use the {esql}
<<esql-dissect>> and <<esql-grok>> commands. `DISSECT` works by breaking up a
string using a delimiter-based pattern. `GROK` works similarly, but uses regular
expressions. This makes `GROK` more powerful, but generally also slower.

In this case, no regular expressions are needed, as the `message` is
straightforward: "Connected to ", followed by the server IP. To match this
string, you can use the following `DISSECT` command:

[source,esql]
----
include::{esql-specs}/dissect.csv-spec[tag=gs-dissect]
----
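A sketch of the extraction, with an equivalent `GROK` variant shown as a
comment (the `%{IP}` Grok pattern is an assumption):

[source,esql]
----
FROM sample_data
| DISSECT message "Connected to %{server_ip}"
// the same extraction with GROK, backed by a regular expression:
// | GROK message "Connected to %{IP:server_ip}"
----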
This adds a `server_ip` column to those rows that have a `message` that matches
this pattern. For other rows, the value of `server_ip` is `null`.

You can use the new `server_ip` column that's added by the `DISSECT` command in
subsequent commands. For example, to determine how many connections each server
has accepted:

[source,esql]
----
include::{esql-specs}/dissect.csv-spec[tag=gs-dissect-stats-by]
----

For more about data processing with {esql}, refer to
<<esql-process-data-with-dissect-and-grok>>.

[discrete]
[[esql-getting-learn-more]]
=== Learn more

To learn more about {esql}, refer to <<esql-language>> and <<esql-using>>.
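Counting accepted connections per server could then look like this (a sketch;
rows where `server_ip` is `null` form their own group unless filtered out
first):

[source,esql]
----
FROM sample_data
| DISSECT message "Connected to %{server_ip}"
| STATS COUNT(*) BY server_ip
----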