
[[esql-cross-clusters]]
=== Using {esql} across clusters
++++
<titleabbrev>Using {esql} across clusters</titleabbrev>
++++

[partintro]
[NOTE]
====
For {ccs-cap} with {esql} on version 8.16 or later, remote clusters must also be on version 8.16 or later.
====

With {esql}, you can execute a single query across multiple clusters.

[discrete]
[[esql-ccs-prerequisites]]
==== Prerequisites

include::{es-ref-dir}/search/search-your-data/search-across-clusters.asciidoc[tag=ccs-prereqs]

include::{es-ref-dir}/search/search-your-data/search-across-clusters.asciidoc[tag=ccs-gateway-seed-nodes]

include::{es-ref-dir}/search/search-your-data/search-across-clusters.asciidoc[tag=ccs-proxy-mode]

[discrete]
[[esql-ccs-security-model]]
==== Security model

{es} supports two security models for cross-cluster search (CCS):

* <<esql-ccs-security-model-certificate, TLS certificate authentication>>
* <<esql-ccs-security-model-api-key, API key authentication>>

[TIP]
====
To check which security model is being used to connect your clusters, run `GET _remote/info`.
If you're using the API key authentication method, you'll see the `"cluster_credentials"` key in the response.
====
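
For example, running `GET _remote/info` on the local cluster returns one object per configured remote. The response below is illustrative only (fields abridged; the exact fields depend on your connection mode, configuration, and version), for a hypothetical remote named `my_remote_cluster`:

[source,console-result]
----
{
  "my_remote_cluster": {
    "connected": true,
    "mode": "sniff",
    "skip_unavailable": true,
    "cluster_credentials": "::es_redacted::" <1>
  }
}
----
<1> The presence of the `cluster_credentials` field indicates the remote is secured with the API key based model; the credential value itself is always redacted.
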
[discrete]
[[esql-ccs-security-model-certificate]]
===== TLS certificate authentication

TLS certificate authentication secures remote clusters with mutual TLS.
This may be the preferred model when a single administrator has full control over both clusters.
We generally recommend that roles and their privileges be identical in both clusters.
Refer to <<remote-clusters-cert, TLS certificate authentication>> for prerequisites and detailed setup instructions.

[discrete]
[[esql-ccs-security-model-api-key]]
===== API key authentication

The following information pertains to using {esql} across clusters with the <<remote-clusters-api-key, *API key based security model*>>. Follow the steps on that page for the *full setup instructions*. This page only contains additional information specific to {esql}.

API key based cross-cluster search (CCS) enables more granular control over allowed actions between clusters.
This may be the preferred model when you have different administrators for different clusters and want more control over who can access what data. In this model, cluster administrators must explicitly define the access given to clusters and users.

You will need to:

* Create an API key on the *remote cluster* using the <<security-api-create-cross-cluster-api-key,Create cross-cluster API key>> API or the {kibana-ref}/api-keys.html[Kibana API keys UI].
* Add the API key to the keystore on the *local cluster*, as part of the steps in <<remote-clusters-security-api-key-local-actions,configuring the local cluster>>. All cross-cluster requests from the local cluster are bound by the API key's privileges.
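
As a sketch of the first step, a minimal cross-cluster API key granting search access to `logs-*` could be created on the remote cluster as follows. The key name is a placeholder; refer to the linked setup pages for the authoritative steps:

[source,console]
----
POST /_security/cross_cluster/api_key
{
  "name": "esql-ccs-key",
  "access": {
    "search": [
      {
        "names": ["logs-*"]
      }
    ]
  }
}
----

The `encoded` value in the response is what you then add to the keystore on the local cluster.
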
Using {esql} with the API key based security model requires some additional permissions that may not be needed when using the traditional query DSL based search.
The following example API call creates a role that can query remote indices using {esql} when using the API key based security model.
The final privilege, `remote_cluster`, is required to allow remote enrich operations.

[source,console]
----
POST /_security/role/remote1
{
  "cluster": ["cross_cluster_search"], <1>
  "indices": [
    {
      "names" : [""], <2>
      "privileges": ["read"]
    }
  ],
  "remote_indices": [ <3>
    {
      "names": [ "logs-*" ],
      "privileges": [ "read", "read_cross_cluster" ], <4>
      "clusters" : ["my_remote_cluster"] <5>
    }
  ],
  "remote_cluster": [ <6>
    {
      "privileges": [
        "monitor_enrich"
      ],
      "clusters": [
        "my_remote_cluster"
      ]
    }
  ]
}
----
<1> The `cross_cluster_search` cluster privilege is required for the _local_ cluster.
<2> Typically, users will have permissions to read both local and remote indices. However, for cases where the role
is intended to ONLY search the remote cluster, the `read` permission is still required for the local cluster.
To provide read access to the local cluster while disallowing reads of any index in it, the `names`
field may be an empty string.
<3> The indices allowed read access on the remote cluster. The configured
<<security-api-create-cross-cluster-api-key,cross-cluster API key>> must also allow this index to be read.
<4> The `read_cross_cluster` privilege is always required when using {esql} across clusters with the API key based
security model.
<5> The remote clusters to which these privileges apply.
Each remote cluster must be configured with a <<security-api-create-cross-cluster-api-key,cross-cluster API key>>
and connected to the local cluster before the remote index can be queried.
Verify the connection using the <<cluster-remote-info, Remote cluster info>> API.
<6> Required to allow remote enrichment. Without this, the user cannot read from the `.enrich` indices on the
remote cluster. The `remote_cluster` security privilege was introduced in version *8.15.0*.
You will then need a user or API key with the permissions you created above. The following example API call creates
a user with the `remote1` role.

[source,console]
----
POST /_security/user/remote_user
{
  "password" : "<PASSWORD>",
  "roles" : [ "remote1" ]
}
----
Remember that all cross-cluster requests from the local cluster are bound by the cross-cluster API key's privileges,
which are controlled by the remote cluster's administrator.

[TIP]
====
Cross-cluster API keys created in versions prior to 8.15.0 will need to be replaced or updated to add the new permissions
required for {esql} with ENRICH.
====
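
As a sketch, rather than replacing such a key, an existing cross-cluster API key can be updated in place with the Update cross-cluster API key API. Replace `<id>` with the ID of your key; the `access` definition shown is an example:

[source,console]
----
PUT /_security/cross_cluster/api_key/<id>
{
  "access": {
    "search": [
      {
        "names": ["logs-*"]
      }
    ]
  }
}
----

Resubmitting the `access` definition regenerates the key's role descriptors, which should then include the privileges required for {esql} with ENRICH.
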
[discrete]
[[ccq-remote-cluster-setup]]
==== Remote cluster setup

Once the security model is configured, you can add remote clusters.

include::{es-ref-dir}/search/search-your-data/search-across-clusters.asciidoc[tag=ccs-remote-cluster-setup]

<1> Since `skip_unavailable` was not set on `cluster_three`, it uses
the default of `true`. See the <<ccq-skip-unavailable-clusters>>
section for details.
[discrete]
[[ccq-from]]
==== Query across multiple clusters

In the `FROM` command, specify data streams and indices on remote clusters
using the format `<remote_cluster_name>:<target>`. For instance, the following
{esql} request queries the `my-index-000001` index on a single remote cluster
named `cluster_one`:

[source,esql]
----
FROM cluster_one:my-index-000001
| LIMIT 10
----

Similarly, this {esql} request queries the `my-index-000001` index from
three clusters:

* The local ("querying") cluster
* Two remote clusters, `cluster_one` and `cluster_two`

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| LIMIT 10
----

Likewise, this {esql} request queries the `my-index-000001` index from all
remote clusters (`cluster_one`, `cluster_two`, and `cluster_three`):

[source,esql]
----
FROM *:my-index-000001
| LIMIT 10
----
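
Cluster and index wildcards can also be combined. For example, assuming the same clusters, a query like the following would target every index matching `my-index-*` on both the local cluster and all remote clusters:

[source,esql]
----
FROM my-index-*,*:my-index-*
| LIMIT 10
----
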
[discrete]
[[ccq-cluster-details]]
==== Cross-cluster metadata

Using the `"include_ccs_metadata": true` option, users can request that
ES|QL {ccs} responses include metadata about the search on each cluster (when the response format is JSON).
Here we show an example using the async search endpoint. {ccs-cap} metadata is also present in the synchronous
search endpoint response when requested.

[source,console]
----
POST /_query/async?format=json
{
  "query": """
    FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index*
    | STATS COUNT(http.response.status_code) BY user.id
    | LIMIT 2
  """,
  "include_ccs_metadata": true
}
----
// TEST[setup:my_index]
// TEST[s/cluster_one:my-index-000001,cluster_two:my-index//]

Which returns:
[source,console-result]
----
{
  "is_running": false,
  "took": 42, <1>
  "is_partial": false, <7>
  "columns" : [
    {
      "name" : "COUNT(http.response.status_code)",
      "type" : "long"
    },
    {
      "name" : "user.id",
      "type" : "keyword"
    }
  ],
  "values" : [
    [4, "elkbee"],
    [1, "kimchy"]
  ],
  "_clusters": { <2>
    "total": 3,
    "successful": 3,
    "running": 0,
    "skipped": 0,
    "partial": 0,
    "failed": 0,
    "details": { <3>
      "(local)": { <4>
        "status": "successful",
        "indices": "blogs",
        "took": 41, <5>
        "_shards": { <6>
          "total": 13,
          "successful": 13,
          "skipped": 0,
          "failed": 0
        }
      },
      "cluster_one": {
        "status": "successful",
        "indices": "cluster_one:my-index-000001",
        "took": 38,
        "_shards": {
          "total": 4,
          "successful": 4,
          "skipped": 0,
          "failed": 0
        }
      },
      "cluster_two": {
        "status": "successful",
        "indices": "cluster_two:my-index*",
        "took": 40,
        "_shards": {
          "total": 18,
          "successful": 18,
          "skipped": 1,
          "failed": 0
        }
      }
    }
  }
}
----
// TEST[skip: cross-cluster testing env not set up]
<1> How long the entire search (across all clusters) took, in milliseconds.
<2> This section of counters shows all possible cluster search states and how many cluster
searches are currently in each state. A cluster can have one of the following statuses: *running*,
*successful* (searches on all shards were successful), *skipped* (the search
failed on a cluster marked with `skip_unavailable`=`true`), *failed* (the search
failed on a cluster marked with `skip_unavailable`=`false`), or *partial* (the search was
<<esql-async-query-stop-api, interrupted>> before finishing or partially failed).
<3> The `_clusters/details` section shows metadata about the search on each cluster.
<4> If you included indices from the local cluster you sent the request to in your {ccs},
it is identified as "(local)".
<5> How long (in milliseconds) the search took on each cluster. This can be useful to determine
which clusters have slower response times than others.
<6> The shard details for the search on that cluster, including a count of shards that were
skipped due to the can-match phase results. Shards are skipped when they cannot have any matching data
and therefore are not included in the full ES|QL query.
<7> The `is_partial` field is set to `true` if the search has partial results for any reason,
for example if it was interrupted before finishing using the <<esql-async-query-stop-api,async query stop API>>,
or if one of the remote clusters or shards failed.

The cross-cluster metadata can be used to determine whether any data came back from a cluster.
For instance, in the query below, the wildcard expression for `cluster_two` did not resolve
to a concrete index (or indices). The cluster is, therefore, marked as `skipped` and the total
number of shards searched is set to zero.
[source,console]
----
POST /_query/async?format=json
{
  "query": """
    FROM cluster_one:my-index*,cluster_two:logs*
    | STATS COUNT(http.response.status_code) BY user.id
    | LIMIT 2
  """,
  "include_ccs_metadata": true
}
----
// TEST[continued]
// TEST[s/cluster_one:my-index\*,cluster_two:logs\*/my-index-000001/]

Which returns:

[source,console-result]
----
{
  "is_running": false,
  "took": 55,
  "is_partial": false,
  "columns": [
    ... // not shown
  ],
  "values": [
    ... // not shown
  ],
  "_clusters": {
    "total": 2,
    "successful": 1,
    "running": 0,
    "skipped": 1,
    "partial": 0,
    "failed": 0,
    "details": {
      "cluster_one": {
        "status": "successful",
        "indices": "cluster_one:my-index*",
        "took": 38,
        "_shards": {
          "total": 4,
          "successful": 4,
          "skipped": 0,
          "failed": 0
        }
      },
      "cluster_two": {
        "status": "skipped", <1>
        "indices": "cluster_two:logs*",
        "took": 0,
        "_shards": {
          "total": 0, <2>
          "successful": 0,
          "skipped": 0,
          "failed": 0
        }
      }
    }
  }
}
----
// TEST[skip: cross-cluster testing env not set up]

<1> This cluster is marked as `skipped`, since there were no matching indices on that cluster.
<2> Indicates that no shards were searched (due to not having any matching indices).
[discrete]
[[ccq-enrich]]
==== Enrich across clusters

Enrich in {esql} across clusters operates similarly to <<esql-enrich,local enrich>>.
If the enrich policy and its enrich indices are consistent across all clusters, simply
write the enrich command as you would without remote clusters. In this default mode,
{esql} can execute the enrich command on either the local cluster or the remote
clusters, aiming to minimize computation or inter-cluster data transfer. Ensuring that
the policy exists with consistent data on both the local cluster and the remote
clusters is critical for ES|QL to produce a consistent query result.
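
For illustration, a `hosts` enrich policy like the one used in the examples below could be defined and executed as follows on each participating cluster. The source index name `hosts-metadata` and the enrich fields are hypothetical:

[source,console]
----
PUT /_enrich/policy/hosts
{
  "match": {
    "indices": "hosts-metadata",
    "match_field": "ip",
    "enrich_fields": ["host_name", "os"]
  }
}

POST /_enrich/policy/hosts/_execute
----

Applying the same policy definition and execution on every cluster (and re-executing when the source data changes) is one way to keep the enrich indices consistent.
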
[TIP]
====
Enrich in {esql} across clusters using the API key based security model was introduced in version *8.15.0*.
Cross-cluster API keys created in versions prior to 8.15.0 will need to be replaced or updated to use the new required permissions.
Refer to the example in the <<esql-ccs-security-model-api-key,API key authentication>> section.
====
In the following example, the enrich with `hosts` policy can be executed on
either the local cluster or the remote cluster `cluster_one`.

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001
| ENRICH hosts ON ip
| LIMIT 10
----
An enrich in an {esql} query that targets only remote clusters can also be executed on
the local cluster. This means the query below requires the `hosts` enrich
policy to exist on the local cluster as well.

[source,esql]
----
FROM cluster_one:my-index-000001,cluster_two:my-index-000001
| LIMIT 10
| ENRICH hosts ON ip
----
[discrete]
[[esql-enrich-coordinator]]
===== Enrich with coordinator mode

{esql} provides the enrich `_coordinator` mode to force {esql} to execute the enrich
command on the local cluster. This mode should be used when the enrich policy is
not available on the remote clusters or when maintaining consistency of enrich indices
across clusters is challenging.

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001
| ENRICH _coordinator:hosts ON ip
| SORT host_name
| LIMIT 10
----

[IMPORTANT]
====
Enrich with the `_coordinator` mode usually increases inter-cluster data transfer and
workload on the local cluster.
====
[discrete]
[[esql-enrich-remote]]
===== Enrich with remote mode

{esql} also provides the enrich `_remote` mode to force {esql} to execute the enrich
command independently on each remote cluster where the target indices reside.
This mode is useful for managing different enrich data on each cluster, such as detailed
information about the hosts in each region, where the target (main) indices contain
log events from these hosts.

In the example below, the `hosts` enrich policy is required to exist on all
clusters taking part in the query: the local (querying) cluster (as local indices
are included), and the remote clusters `cluster_one` and `cluster_two`.

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH _remote:hosts ON ip
| SORT host_name
| LIMIT 10
----

A `_remote` enrich cannot be executed after a <<esql-stats-by,stats>>
command. The following example would result in an error:

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| STATS COUNT(*) BY ip
| ENRICH _remote:hosts ON ip
| SORT host_name
| LIMIT 10
----
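
One way to work around this limitation, sketched below, is to apply the remote enrich before the aggregation, so that each cluster enriches its own data first and the aggregation then groups by an enriched field:

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH _remote:hosts ON ip
| STATS COUNT(*) BY host_name
| SORT host_name
| LIMIT 10
----
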
[discrete]
[[esql-multi-enrich]]
===== Multiple enrich commands

You can include multiple enrich commands in the same query with different
modes. {esql} will attempt to execute them accordingly. For example, this
query performs two enriches, first with the `hosts` policy on any cluster
and then with the `vendors` policy on the local cluster.

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH hosts ON ip
| ENRICH _coordinator:vendors ON os
| LIMIT 10
----

A `_remote` enrich command can't be executed after a `_coordinator` enrich
command. The following example would result in an error.

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH _coordinator:hosts ON ip
| ENRICH _remote:vendors ON os
| LIMIT 10
----
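
The reverse order should work, however: a `_remote` enrich followed by a `_coordinator` enrich, since the remote enrichment happens before the data reaches the local cluster. A sketch:

[source,esql]
----
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| ENRICH _remote:hosts ON ip
| ENRICH _coordinator:vendors ON os
| LIMIT 10
----
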
[discrete]
[[ccq-exclude]]
==== Excluding clusters or indices from an {esql} query

To exclude an entire cluster, prefix the cluster alias with a minus sign in
the `FROM` command, for example: `-my_cluster:*`:

[source,esql]
----
FROM my-index-000001,cluster*:my-index-000001,-cluster_three:*
| LIMIT 10
----

To exclude a specific remote index, prefix the index with a minus sign in
the `FROM` command, such as `my_cluster:-my_index`:

[source,esql]
----
FROM my-index-000001,cluster*:my-index-*,cluster_three:-my-index-000001
| LIMIT 10
----
[discrete]
[[ccq-skip-unavailable-clusters]]
==== Optional remote clusters

If a remote cluster is configured with `skip_unavailable: true` (the default setting), the cluster is set
to `skipped` or `partial` status and the query does not fail if:

* The remote cluster is disconnected from the querying cluster, either before or during the query.
* The remote cluster does not have the requested index.
* An error occurred while processing the query on the remote cluster.

The `partial` status is used if the remote query was partially successful and some data was returned.
However, this does not apply when the remote cluster is missing an index and it is the only index in the query,
or when all the indices in the query are missing. For example, the following queries will fail:

[source,esql]
----
FROM cluster_one:missing-index | LIMIT 10
FROM cluster_one:missing-index* | LIMIT 10
FROM cluster_one:missing-index*,cluster_two:missing-index | LIMIT 10
----
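
Conversely, if you prefer a query to fail outright whenever a given remote is unreachable, you can set `skip_unavailable` to `false` for that cluster via the cluster settings API. For example, for a remote named `cluster_one`:

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.cluster_one.skip_unavailable": false
  }
}
----
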
[discrete]
[[ccq-during-upgrade]]
==== Query across clusters during an upgrade

include::{es-ref-dir}/search/search-your-data/search-across-clusters.asciidoc[tag=ccs-during-upgrade]