1
0

cross-cluster-search.asciidoc 12 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418
  1. [[modules-cross-cluster-search]]
  2. == Search across clusters
  3. *{ccs-cap}* lets you run a single search request against one or more
  4. <<modules-remote-clusters,remote clusters>>. For example, you can use a {ccs} to
  5. filter and analyze log data stored on clusters in different data centers.
  6. IMPORTANT: {ccs-cap} requires <<modules-remote-clusters, remote clusters>>.
  7. [discrete]
  8. [[ccs-supported-apis]]
  9. === Supported APIs
  10. The following APIs support {ccs}:
  11. * <<search-search,Search>>
  12. * <<search-multi-search,Multi search>>
  13. * <<search-template,Search template>>
  14. * <<multi-search-template,Multi search template>>
  15. [discrete]
  16. [[ccs-example]]
  17. === {ccs-cap} examples
  18. [discrete]
  19. [[ccs-remote-cluster-setup]]
  20. ==== Remote cluster setup
  21. To perform a {ccs}, you must have at least one remote cluster configured.
  22. The following <<cluster-update-settings,cluster update settings>> API request
  23. adds three remote clusters:`cluster_one`, `cluster_two`, and `cluster_three`.
  24. [source,console]
  25. --------------------------------
  26. PUT _cluster/settings
  27. {
  28. "persistent": {
  29. "cluster": {
  30. "remote": {
  31. "cluster_one": {
  32. "seeds": [
  33. "127.0.0.1:9300"
  34. ]
  35. },
  36. "cluster_two": {
  37. "seeds": [
  38. "127.0.0.1:9301"
  39. ]
  40. },
  41. "cluster_three": {
  42. "seeds": [
  43. "127.0.0.1:9302"
  44. ]
  45. }
  46. }
  47. }
  48. }
  49. }
  50. --------------------------------
  51. // TEST[setup:host]
  52. // TEST[s/127.0.0.1:930\d+/\${transport_host}/]
  53. [discrete]
  54. [[ccs-search-remote-cluster]]
  55. ==== Search a single remote cluster
  56. The following <<search-search,search>> API request searches the
  57. `my-index-000001` index on a single remote cluster, `cluster_one`.
  58. [source,console]
  59. --------------------------------------------------
  60. GET /cluster_one:my-index-000001/_search
  61. {
  62. "query": {
  63. "match": {
  64. "user.id": "kimchy"
  65. }
  66. },
  67. "_source": ["user.id", "message", "http.response.status_code"]
  68. }
  69. --------------------------------------------------
  70. // TEST[continued]
  71. // TEST[setup:my_index]
  72. The API returns the following response:
  73. [source,console-result]
  74. --------------------------------------------------
  75. {
  76. "took": 150,
  77. "timed_out": false,
  78. "_shards": {
  79. "total": 1,
  80. "successful": 1,
  81. "failed": 0,
  82. "skipped": 0
  83. },
  84. "_clusters": {
  85. "total": 1,
  86. "successful": 1,
  87. "skipped": 0
  88. },
  89. "hits": {
  90. "total" : {
  91. "value": 1,
  92. "relation": "eq"
  93. },
  94. "max_score": 1,
  95. "hits": [
  96. {
  97. "_index": "cluster_one:my-index-000001", <1>
  98. "_id": "0",
  99. "_score": 1,
  100. "_source": {
  101. "user": {
  102. "id": "kimchy"
  103. },
  104. "message": "GET /search HTTP/1.1 200 1070000",
  105. "http": {
  106. "response":
  107. {
  108. "status_code": 200
  109. }
  110. }
  111. }
  112. }
  113. ]
  114. }
  115. }
  116. --------------------------------------------------
  117. // TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
  118. // TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
  119. // TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]
  120. <1> The search response body includes the name of the remote cluster in the
  121. `_index` parameter.
  122. [discrete]
  123. [[ccs-search-multi-remote-cluster]]
  124. ==== Search multiple remote clusters
  125. The following <<search,search>> API request searches the `my-index-000001` index on
  126. three clusters:
  127. * Your local cluster
  128. * Two remote clusters, `cluster_one` and `cluster_two`
  129. [source,console]
  130. --------------------------------------------------
  131. GET /my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001/_search
  132. {
  133. "query": {
  134. "match": {
  135. "user.id": "kimchy"
  136. }
  137. },
  138. "_source": ["user.id", "message", "http.response.status_code"]
  139. }
  140. --------------------------------------------------
  141. // TEST[continued]
  142. The API returns the following response:
  143. [source,console-result]
  144. --------------------------------------------------
  145. {
  146. "took": 150,
  147. "timed_out": false,
  148. "num_reduce_phases": 4,
  149. "_shards": {
  150. "total": 3,
  151. "successful": 3,
  152. "failed": 0,
  153. "skipped": 0
  154. },
  155. "_clusters": {
  156. "total": 3,
  157. "successful": 3,
  158. "skipped": 0
  159. },
  160. "hits": {
  161. "total" : {
  162. "value": 3,
  163. "relation": "eq"
  164. },
  165. "max_score": 1,
  166. "hits": [
  167. {
  168. "_index": "my-index-000001", <1>
  169. "_id": "0",
  170. "_score": 2,
  171. "_source": {
  172. "user": {
  173. "id": "kimchy"
  174. },
  175. "message": "GET /search HTTP/1.1 200 1070000",
  176. "http": {
  177. "response":
  178. {
  179. "status_code": 200
  180. }
  181. }
  182. }
  183. },
  184. {
  185. "_index": "cluster_one:my-index-000001", <2>
  186. "_id": "0",
  187. "_score": 1,
  188. "_source": {
  189. "user": {
  190. "id": "kimchy"
  191. },
  192. "message": "GET /search HTTP/1.1 200 1070000",
  193. "http": {
  194. "response":
  195. {
  196. "status_code": 200
  197. }
  198. }
  199. }
  200. },
  201. {
  202. "_index": "cluster_two:my-index-000001", <3>
  203. "_id": "0",
  204. "_score": 1,
  205. "_source": {
  206. "user": {
  207. "id": "kimchy"
  208. },
  209. "message": "GET /search HTTP/1.1 200 1070000",
  210. "http": {
  211. "response":
  212. {
  213. "status_code": 200
  214. }
  215. }
  216. }
  217. }
  218. ]
  219. }
  220. }
  221. --------------------------------------------------
  222. // TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
  223. // TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
  224. // TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]
  225. // TESTRESPONSE[s/"_score": 2/"_score": "$body.hits.hits.1._score"/]
  226. <1> This document's `_index` parameter doesn't include a cluster name. This
  227. means the document came from the local cluster.
  228. <2> This document came from `cluster_one`.
  229. <3> This document came from `cluster_two`.
  230. [discrete]
  231. [[skip-unavailable-clusters]]
  232. === Skip unavailable clusters
  233. By default, a {ccs} returns an error if *any* cluster in the request is
  234. unavailable.
  235. To skip an unavailable cluster during a {ccs}, set the
  236. <<skip-unavailable,`skip_unavailable`>> cluster setting to `true`.
  237. The following <<cluster-update-settings,cluster update settings>> API request
  238. changes `cluster_two`'s `skip_unavailable` setting to `true`.
  239. [source,console]
  240. --------------------------------
  241. PUT _cluster/settings
  242. {
  243. "persistent": {
  244. "cluster.remote.cluster_two.skip_unavailable": true
  245. }
  246. }
  247. --------------------------------
  248. // TEST[continued]
  249. If `cluster_two` is disconnected or unavailable during a {ccs}, {es} won't
  250. include matching documents from that cluster in the final results.
  251. [discrete]
  252. [[ccs-gateway-seed-nodes]]
  253. === Selecting gateway and seed nodes in sniff mode
  254. For remote clusters using the <<sniff-mode,sniff connection>> mode, gateway and
  255. seed nodes need to be accessible from the local cluster via your network.
  256. By default, any non-<<master-node,master-eligible>> node can act as a
  257. gateway node. If wanted, you can define the gateway nodes for a cluster by
  258. setting `cluster.remote.node.attr.gateway` to `true`.
  259. For {ccs}, we recommend you use gateway nodes that are capable of serving as
  260. <<coordinating-node,coordinating nodes>> for search requests. If
  261. wanted, the seed nodes for a cluster can be a subset of these gateway nodes.
  262. [discrete]
  263. [[ccs-proxy-mode]]
  264. === {ccs-cap} in proxy mode
  265. <<proxy-mode,Proxy mode>> remote cluster connections support {ccs}. All remote
  266. connections connect to the configured `proxy_address`. Any desired connection
  267. routing to gateway or <<coordinating-node,coordinating nodes>> must
  268. be implemented by the intermediate proxy at this configured address.
  269. [discrete]
  270. [[ccs-network-delays]]
  271. === How {ccs} handles network delays
  272. Because {ccs} involves sending requests to remote clusters, any network delays
  273. can impact search speed. To avoid slow searches, {ccs} offers two options for
  274. handling network delays:
  275. <<ccs-min-roundtrips,Minimize network roundtrips>>::
  276. By default, {es} reduces the number of network roundtrips between remote
  277. clusters. This reduces the impact of network delays on search speed. However,
  278. {es} can't reduce network roundtrips for large search requests, such as those
  279. including a <<scroll-search-results, scroll>> or
  280. <<request-body-search-inner-hits,inner hits>>.
  281. +
  282. See <<ccs-min-roundtrips>> to learn how this option works.
  283. <<ccs-unmin-roundtrips, Don't minimize network roundtrips>>:: For search
  284. requests that include a scroll or inner hits, {es} sends multiple outgoing and
  285. ingoing requests to each remote cluster. You can also choose this option by
  286. setting the <<ccs-minimize-roundtrips,`ccs_minimize_roundtrips`>> parameter to
  287. `false`. While typically slower, this approach may work well for networks with
  288. low latency.
  289. +
  290. See <<ccs-unmin-roundtrips>> to learn how this option works.
  291. [discrete]
  292. [[ccs-min-roundtrips]]
  293. ==== Minimize network roundtrips
  294. Here's how {ccs} works when you minimize network roundtrips.
  295. . You send a {ccs} request to your local cluster. A coordinating node in that
  296. cluster receives and parses the request.
  297. +
  298. image:images/ccs/ccs-min-roundtrip-client-request.svg[]
  299. . The coordinating node sends a single search request to each cluster, including
  300. the local cluster. Each cluster performs the search request independently,
  301. applying its own cluster-level settings to the request.
  302. +
  303. image:images/ccs/ccs-min-roundtrip-cluster-search.svg[]
  304. . Each remote cluster sends its search results back to the coordinating node.
  305. +
  306. image:images/ccs/ccs-min-roundtrip-cluster-results.svg[]
  307. . After collecting results from each cluster, the coordinating node returns the
  308. final results in the {ccs} response.
  309. +
  310. image:images/ccs/ccs-min-roundtrip-client-response.svg[]
  311. [discrete]
  312. [[ccs-unmin-roundtrips]]
  313. ==== Don't minimize network roundtrips
  314. Here's how {ccs} works when you don't minimize network roundtrips.
  315. . You send a {ccs} request to your local cluster. A coordinating node in that
  316. cluster receives and parses the request.
  317. +
  318. image:images/ccs/ccs-min-roundtrip-client-request.svg[]
  319. . The coordinating node sends a <<search-shards,search shards>> API request to
  320. each remote cluster.
  321. +
  322. image:images/ccs/ccs-min-roundtrip-cluster-search.svg[]
  323. . Each remote cluster sends its response back to the coordinating node.
  324. This response contains information about the indices and shards the {ccs}
  325. request will be executed on.
  326. +
  327. image:images/ccs/ccs-min-roundtrip-cluster-results.svg[]
  328. . The coordinating node sends a search request to each shard, including those in
  329. its own cluster. Each shard performs the search request independently.
  330. +
  331. [WARNING]
  332. ====
  333. When network roundtrips aren't minimized, the search is executed as if all data
  334. were in the coordinating node's cluster. We recommend updating cluster-level
  335. settings that limit searches, such as `action.search.shard_count.limit`,
  336. `pre_filter_shard_size`, and `max_concurrent_shard_requests`, to account for
  337. this. If these limits are too low, the search may be rejected.
  338. ====
  339. +
  340. image:images/ccs/ccs-dont-min-roundtrip-shard-search.svg[]
  341. . Each shard sends its search results back to the coordinating node.
  342. +
  343. image:images/ccs/ccs-dont-min-roundtrip-shard-results.svg[]
  344. . After collecting results from each cluster, the coordinating node returns the
  345. final results in the {ccs} response.
  346. +
  347. image:images/ccs/ccs-min-roundtrip-client-response.svg[]
  348. [discrete]
  349. [[ccs-supported-configurations]]
  350. === Supported configurations
  351. Generally, <<gateway-nodes-selection, cross cluster search>> can search remote
  352. clusters that are one major version ahead or behind the coordinating node's
  353. version. Cross cluster search can also search remote clusters that are being
  354. <<rolling-upgrades, upgraded>> so long as both the "upgrade from" and
  355. "upgrade to" version are compatible with the gateway node.
  356. For example, a coordinating node running {es} 5.6 can search a remote cluster
  357. running {es} 6.8, but that cluster can not be upgraded to 7.1. In this case
  358. you should first upgrade the coordinating node to 7.1 and then upgrade remote
  359. cluster.
  360. WARNING: Running multiple versions of {es} in the same cluster beyond the
  361. duration of an upgrade is not supported.