|
@@ -0,0 +1,175 @@
|
|
|
+[[esql-multi-index]]
|
|
|
+=== Using {esql} to query multiple indices
|
|
|
+++++
|
|
|
+<titleabbrev>Using {esql} to query multiple indices</titleabbrev>
|
|
|
+++++
|
|
|
+
|
|
|
+With {esql}, you can execute a single query across multiple indices, data streams, or aliases.
|
|
|
+To do so, use wildcards and date arithmetic. The following example uses a comma-separated list and a wildcard:
|
|
|
+
|
|
|
+[source,esql]
|
|
|
+----
|
|
|
+FROM employees-00001,other-employees-*
|
|
|
+----
|
|
|
+
|
|
|
+Use the format `<remote_cluster_name>:<target>` to <<esql-cross-clusters, query data streams and indices
|
|
|
+on remote clusters>>:
|
|
|
+
|
|
|
+[source,esql]
|
|
|
+----
|
|
|
+FROM cluster_one:employees-00001,cluster_two:other-employees-*
|
|
|
+----
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[esql-multi-index-invalid-mapping]]
|
|
|
+=== Field type mismatches
|
|
|
+
|
|
|
+When querying multiple indices, data streams, or aliases, you might find that the same field is mapped to multiple different types.
|
|
|
+For example, consider the two indices with the following field mappings:
|
|
|
+
|
|
|
+*index: events_ip*
|
|
|
+```
|
|
|
+{
|
|
|
+ "mappings": {
|
|
|
+ "properties": {
|
|
|
+ "@timestamp": { "type": "date" },
|
|
|
+ "client_ip": { "type": "ip" },
|
|
|
+ "event_duration": { "type": "long" },
|
|
|
+ "message": { "type": "keyword" }
|
|
|
+ }
|
|
|
+ }
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+*index: events_keyword*
|
|
|
+```
|
|
|
+{
|
|
|
+ "mappings": {
|
|
|
+ "properties": {
|
|
|
+ "@timestamp": { "type": "date" },
|
|
|
+ "client_ip": { "type": "keyword" },
|
|
|
+ "event_duration": { "type": "long" },
|
|
|
+ "message": { "type": "keyword" }
|
|
|
+ }
|
|
|
+ }
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+When you query each of these individually with a simple query like `FROM events_ip`, the results are provided with type-specific columns:
|
|
|
+
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+FROM events_ip
|
|
|
+| SORT @timestamp DESC
|
|
|
+----
|
|
|
+[%header.monospaced.styled,format=dsv,separator=|]
|
|
|
+|===
|
|
|
+@timestamp:date | client_ip:ip | event_duration:long | message:keyword
|
|
|
+2023-10-23T13:55:01.543Z | 172.21.3.15 | 1756467 | Connected to 10.1.0.1
|
|
|
+2023-10-23T13:53:55.832Z | 172.21.3.15 | 5033755 | Connection error
|
|
|
+2023-10-23T13:52:55.015Z | 172.21.3.15 | 8268153 | Connection error
|
|
|
+|===
|
|
|
+
|
|
|
+Note how the `client_ip` column is correctly identified as type `ip`, and all values are displayed.
|
|
|
+However, if instead the query sources two conflicting indices with `FROM events_*`, the type of the `client_ip` column cannot be determined
|
|
|
+and is reported as `unsupported` with all values returned as `null`.
|
|
|
+
|
|
|
+[[query-unsupported]]
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+FROM events_*
|
|
|
+| SORT @timestamp DESC
|
|
|
+----
|
|
|
+[%header.monospaced.styled,format=dsv,separator=|]
|
|
|
+|===
|
|
|
+@timestamp:date | client_ip:unsupported | event_duration:long | message:keyword
|
|
|
+2023-10-23T13:55:01.543Z | null | 1756467 | Connected to 10.1.0.1
|
|
|
+2023-10-23T13:53:55.832Z | null | 5033755 | Connection error
|
|
|
+2023-10-23T13:52:55.015Z | null | 8268153 | Connection error
|
|
|
+2023-10-23T13:51:54.732Z | null | 725448 | Connection error
|
|
|
+2023-10-23T13:33:34.937Z | null | 1232382 | Disconnected
|
|
|
+2023-10-23T12:27:28.948Z | null | 2764889 | Connected to 10.1.0.2
|
|
|
+2023-10-23T12:15:03.360Z | null | 3450233 | Connected to 10.1.0.3
|
|
|
+|===
|
|
|
+
|
|
|
+In addition, if the query refers to this unsupported field directly, the query fails:
|
|
|
+
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+FROM events_*
|
|
|
+| KEEP @timestamp, client_ip, event_duration, message
|
|
|
+| SORT @timestamp DESC
|
|
|
+----
|
|
|
+
|
|
|
+[source,bash]
|
|
|
+----
|
|
|
+Cannot use field [client_ip] due to ambiguities being mapped as
|
|
|
+[2] incompatible types:
|
|
|
+ [ip] in [events_ip],
|
|
|
+ [keyword] in [events_keyword]
|
|
|
+----
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[esql-multi-index-union-types]]
|
|
|
+=== Union types
|
|
|
+
|
|
|
+{esql} has a way to handle <<esql-multi-index-invalid-mapping, field type mismatches>>. When the same field is mapped to multiple types in multiple indices,
|
|
|
+the type of the field is understood to be a _union_ of the various types in the index mappings.
|
|
|
+As seen in the preceding examples, this _union type_ cannot be used in the results,
|
|
|
+and cannot be referred to by the query
|
|
|
+-- except when it's passed to a type conversion function that accepts all the types in the _union_ and converts the field
|
|
|
+to a single type. {esql} offers a suite of <<esql-type-conversion-functions,type conversion functions>> to achieve this.
|
|
|
+
|
|
|
+In the above examples, the query can use a command like `EVAL client_ip = TO_IP(client_ip)` to resolve
|
|
|
+the union of `ip` and `keyword` to just `ip`.
|
|
|
+You can also use the type-conversion syntax `EVAL client_ip = client_ip::IP`.
|
|
|
+Alternatively, the query could use <<esql-to_string,`TO_STRING`>> to convert all supported types into `KEYWORD`.
|
|
|
+
|
|
|
+For example, the <<query-unsupported,query>> that returned `client_ip:unsupported` with `null` values can be improved using the `TO_IP` function or the equivalent `field::ip` syntax.
|
|
|
+These changes also resolve the error message.
|
|
|
+As long as the only reference to the original field is to pass it to a conversion function that resolves the type ambiguity, no error results.
|
|
|
+
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+FROM events_*
|
|
|
+| EVAL client_ip = TO_IP(client_ip)
|
|
|
+| KEEP @timestamp, client_ip, event_duration, message
|
|
|
+| SORT @timestamp DESC
|
|
|
+----
|
|
|
+[%header.monospaced.styled,format=dsv,separator=|]
|
|
|
+|===
|
|
|
+@timestamp:date | client_ip:ip | event_duration:long | message:keyword
|
|
|
+2023-10-23T13:55:01.543Z | 172.21.3.15 | 1756467 | Connected to 10.1.0.1
|
|
|
+2023-10-23T13:53:55.832Z | 172.21.3.15 | 5033755 | Connection error
|
|
|
+2023-10-23T13:52:55.015Z | 172.21.3.15 | 8268153 | Connection error
|
|
|
+2023-10-23T13:51:54.732Z | 172.21.3.15 | 725448 | Connection error
|
|
|
+2023-10-23T13:33:34.937Z | 172.21.0.5 | 1232382 | Disconnected
|
|
|
+2023-10-23T12:27:28.948Z | 172.21.2.113 | 2764889 | Connected to 10.1.0.2
|
|
|
+2023-10-23T12:15:03.360Z | 172.21.2.162 | 3450233 | Connected to 10.1.0.3
|
|
|
+|===
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[esql-multi-index-index-metadata]]
|
|
|
+=== Index metadata
|
|
|
+
|
|
|
+It can be helpful to know the particular index from which each row is sourced.
|
|
|
+To get this information, use the <<esql-metadata-fields,`METADATA`>> option on the <<esql-from,`FROM`>> command.
|
|
|
+
|
|
|
+[source.merge.styled,esql]
|
|
|
+----
|
|
|
+FROM events_* METADATA _index
|
|
|
+| EVAL client_ip = TO_IP(client_ip)
|
|
|
+| KEEP _index, @timestamp, client_ip, event_duration, message
|
|
|
+| SORT @timestamp DESC
|
|
|
+----
|
|
|
+[%header.monospaced.styled,format=dsv,separator=|]
|
|
|
+|===
|
|
|
+_index:keyword | @timestamp:date | client_ip:ip | event_duration:long | message:keyword
|
|
|
+events_ip | 2023-10-23T13:55:01.543Z | 172.21.3.15 | 1756467 | Connected to 10.1.0.1
|
|
|
+events_ip | 2023-10-23T13:53:55.832Z | 172.21.3.15 | 5033755 | Connection error
|
|
|
+events_ip | 2023-10-23T13:52:55.015Z | 172.21.3.15 | 8268153 | Connection error
|
|
|
+events_keyword | 2023-10-23T13:51:54.732Z | 172.21.3.15 | 725448 | Connection error
|
|
|
+events_keyword | 2023-10-23T13:33:34.937Z | 172.21.0.5 | 1232382 | Disconnected
|
|
|
+events_keyword | 2023-10-23T12:27:28.948Z | 172.21.2.113 | 2764889 | Connected to 10.1.0.2
|
|
|
+events_keyword | 2023-10-23T12:15:03.360Z | 172.21.2.162 | 3450233 | Connected to 10.1.0.3
|
|
|
+|===
|