search.asciidoc 6.1 KB

[[search]]
== Search APIs

Most search APIs are <<search-multi-index,multi-index>>, with the
exception of the <<search-explain>> endpoints.

[float]
[[search-routing]]
=== Routing

When executing a search, Elasticsearch picks the "best" copy of the data
based on the <<search-adaptive-replica,adaptive replica selection>> formula.
You can also control which shards are searched by providing the
`routing` parameter. For example, when indexing tweets, the routing value can be
the user name:

[source,js]
--------------------------------------------------
POST /twitter/_doc?routing=kimchy
{
    "user" : "kimchy",
    "postDate" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}
--------------------------------------------------
// CONSOLE

In such a case, to search only the tweets of a specific user, specify that
user name as the routing value so that the search hits only the relevant
shard:

[source,js]
--------------------------------------------------
POST /twitter/_search?routing=kimchy
{
    "query": {
        "bool" : {
            "must" : {
                "query_string" : {
                    "query" : "some query string here"
                }
            },
            "filter" : {
                "term" : { "user" : "kimchy" }
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

The `routing` parameter can take multiple values, given as a comma-separated
string. The search then hits every shard that one of the routing values
maps to.
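
As an illustration, the following request passes two routing values. The
second value, `elasticsearch`, is a hypothetical user name chosen only for
this sketch:

[source,js]
--------------------------------------------------
POST /twitter/_search?routing=kimchy,elasticsearch
{
    "query": {
        "match_all" : {}
    }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]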

[float]
[[search-adaptive-replica]]
=== Adaptive Replica Selection

By default, Elasticsearch will use what is called adaptive replica selection.
This allows the coordinating node to send the request to the copy deemed "best"
based on a number of criteria:

- Response time of past requests between the coordinating node and the node
  containing the copy of the data
- Time past search requests took to execute on the node containing the data
- The queue size of the search threadpool on the node containing the data

This can be turned off by changing the dynamic cluster setting
`cluster.routing.use_adaptive_replica_selection` from `true` to `false`:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
    "transient": {
        "cluster.routing.use_adaptive_replica_selection": false
    }
}
--------------------------------------------------
// CONSOLE

If adaptive replica selection is turned off, searches are sent to the shards
of the target index or indices in round-robin fashion across all copies of
the data (primaries and replicas).

[float]
[[stats-groups]]
=== Stats Groups

A search can be associated with stats groups, each of which maintains a
statistics aggregation. These statistics can later be retrieved using the
<<indices-stats,indices stats>> API. For example, here is a search request
body that associates the request with two different groups:

[source,js]
--------------------------------------------------
POST /_search
{
    "query" : {
        "match_all" : {}
    },
    "stats" : ["group1", "group2"]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
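
To read the collected statistics back, one option is the `groups` parameter
of the indices stats API. A minimal sketch, assuming the `group1` and `group2`
names used in the request above:

[source,js]
--------------------------------------------------
GET /_stats/search?groups=group1,group2
--------------------------------------------------
// CONSOLE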

[float]
[[global-search-timeout]]
=== Global Search Timeout

Individual searches can have a timeout as part of the
<<search-request-body>>. Since search requests can originate from many
sources, Elasticsearch has a dynamic cluster-level setting for a global
search timeout that applies to all search requests that do not set a
timeout in the request body. These requests will be cancelled after
the specified time using the mechanism described in the following section on
<<global-search-cancellation>>. Therefore, the same caveats about timeout
responsiveness apply.

The setting key is `search.default_search_timeout`, and it can be set using the
<<cluster-update-settings>> endpoints. By default, there is no global timeout.
Setting this value to `-1` resets the global search timeout to no timeout.
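
As a sketch, the global timeout could be applied through the cluster update
settings API as follows. The `30s` value is an assumption chosen for
illustration, as is the use of a `transient` rather than a `persistent`
setting:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
    "transient": {
        "search.default_search_timeout": "30s"
    }
}
--------------------------------------------------
// CONSOLE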

[float]
[[global-search-cancellation]]
=== Search Cancellation

Searches can be cancelled using the standard
<<task-cancellation,task cancellation>> mechanism. By default, a running search
only checks whether it has been cancelled on segment boundaries, so
cancellation can be delayed by large segments. Search cancellation
responsiveness can be improved by setting the dynamic cluster-level setting
`search.low_level_cancellation` to `true`. However, this adds the overhead of
more frequent cancellation checks, which can be noticeable on large,
fast-running search queries. Changing this setting only affects searches that
start after the change is made.
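
A minimal sketch of enabling the more responsive cancellation checks, assuming
a `transient` setting is acceptable for the cluster:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
    "transient": {
        "search.low_level_cancellation": true
    }
}
--------------------------------------------------
// CONSOLE

A running search could then be located and cancelled through the task
management API. The task ID below is a hypothetical placeholder. Substitute
an ID returned by the first request:

[source,js]
--------------------------------------------------
GET /_tasks?actions=*search&detailed

POST /_tasks/oTUltX4IQMOUUVeiohTt8A:12345/_cancel
--------------------------------------------------
// CONSOLE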

[float]
[[search-concurrency-and-parallelism]]
=== Search concurrency and parallelism

By default, Elasticsearch doesn't reject any search requests based on the number
of shards the request hits. Although Elasticsearch optimizes the search
execution on the coordinating node, a large number of shards can have a
significant impact on CPU and memory usage. It is usually better to organize
data so that there are fewer, larger shards. To configure a soft limit, update
the `action.search.shard_count.limit` cluster setting to reject search requests
that hit too many shards.
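
For instance, the soft limit might be set as in the following sketch. The
limit of `1000` shards is an assumed value for illustration, not a
recommendation:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
    "transient": {
        "action.search.shard_count.limit": 1000
    }
}
--------------------------------------------------
// CONSOLE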

The request parameter `max_concurrent_shard_requests` controls the maximum
number of concurrent shard requests the search API executes per node for the
request. Use this parameter to protect a single request from overloading a
cluster (for example, a default request hits all indices in a cluster, which
could cause shard request rejections if the number of shards per node is
high). The default value is `5`.
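
As an illustration, the parameter is passed on the request URL. The value `3`
here is an arbitrary assumption that lowers the concurrency below the default:

[source,js]
--------------------------------------------------
GET /twitter/_search?max_concurrent_shard_requests=3
{
    "query": {
        "match_all" : {}
    }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]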

include::search/search.asciidoc[]

include::search/uri-request.asciidoc[]

include::search/request-body.asciidoc[]

include::search/search-template.asciidoc[]

include::search/search-shards.asciidoc[]

include::search/suggesters.asciidoc[]

include::search/multi-search.asciidoc[]

include::search/count.asciidoc[]

include::search/validate.asciidoc[]

include::search/explain.asciidoc[]

include::search/profile.asciidoc[]

include::search/field-caps.asciidoc[]

include::search/rank-eval.asciidoc[]