[[size-your-shards]]
== Size your shards

Each index in {es} is divided into one or more shards, each of which may be
replicated across multiple nodes to protect against hardware failures. If you
are using <<data-streams>> then each data stream is backed by a sequence of
indices. There is a limit to the amount of data you can store on a single node
so you can increase the capacity of your cluster by adding nodes and increasing
the number of indices and shards to match. However, each index and shard has
some overhead and if you divide your data across too many shards then the
overhead can become overwhelming. A cluster with too many indices or shards is
said to suffer from _oversharding_. An oversharded cluster will be less
efficient at responding to searches and in extreme cases it may even become
unstable.

[discrete]
[[create-a-sharding-strategy]]
=== Create a sharding strategy

The best way to prevent oversharding and other shard-related issues is to
create a sharding strategy. A sharding strategy helps you determine and
maintain the optimal number of shards for your cluster while limiting the size
of those shards.

Unfortunately, there is no one-size-fits-all sharding strategy. A strategy that
works in one environment may not scale in another. A good sharding strategy
must account for your infrastructure, use case, and performance expectations.

The best way to create a sharding strategy is to benchmark your production data
on production hardware using the same queries and indexing loads you'd see in
production. For our recommended methodology, watch the
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing[quantitative
cluster sizing video]. As you test different shard configurations, use {kib}'s
{kibana-ref}/elasticsearch-metrics.html[{es} monitoring tools] to track your
cluster's stability and performance.

The following sections provide some reminders and guidelines you should
consider when designing your sharding strategy. If your cluster is already
oversharded, see <<reduce-cluster-shard-count>>.

[discrete]
[[shard-sizing-considerations]]
=== Sizing considerations

Keep the following things in mind when building your sharding strategy.

[discrete]
[[single-thread-per-shard]]
==== Searches run on a single thread per shard

Most searches hit multiple shards. Each shard runs the search on a single
CPU thread. While a shard can run multiple concurrent searches, searches across a
large number of shards can deplete a node's <<modules-threadpool,search
thread pool>>. This can result in low throughput and slow search speeds.
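
One way to check whether search requests are queuing or being rejected is the
<<cat-thread-pool,cat thread pool API>>. This is only an illustrative request;
adjust the columns to your needs.

[source,console]
----
GET _cat/thread_pool/search?v=true&h=node_name,name,active,queue,rejected
----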

[discrete]
[[each-shard-has-overhead]]
==== Each index and shard has overhead

Every index and every shard requires some memory and CPU resources. In most
cases, a small set of large shards uses fewer resources than many small shards.

Segments play a big role in a shard's resource usage. Most shards contain
several segments, which store its index data. {es} keeps segment metadata in
JVM heap memory so it can be quickly retrieved for searches. As a shard grows,
its segments are <<index-modules-merge,merged>> into fewer, larger segments.
This decreases the number of segments, which means less metadata is kept in
heap memory.
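
As an illustration, you can see how many segments each shard currently holds
with the <<cat-segments,cat segments API>>. The index name below is just a
placeholder.

[source,console]
----
GET _cat/segments/my-index-000001?v=true&h=index,shard,prirep,segment,docs.count,size
----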

Every mapped field also carries some overhead in terms of memory usage and disk
space. By default {es} will automatically create a mapping for every field in
every document it indexes, but you can switch off this behaviour to
<<explicit-mapping,take control of your mappings>>.

[discrete]
[[shard-auto-balance]]
==== {es} automatically balances shards within a data tier

A cluster's nodes are grouped into <<data-tiers,data tiers>>. Within each tier,
{es} attempts to spread an index's shards across as many nodes as possible. When
you add a new node or a node fails, {es} automatically rebalances the index's
shards across the tier's remaining nodes.
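
To get a quick view of how shards are currently spread across your nodes, you
can, for example, use the <<cat-allocation,cat allocation API>>, which reports
the shard count and disk usage of each data node.

[source,console]
----
GET _cat/allocation?v=true&h=node,shards,disk.indices,disk.percent
----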

[discrete]
[[shard-size-best-practices]]
=== Best practices

Where applicable, use the following best practices as starting points for your
sharding strategy.

[discrete]
[[delete-indices-not-documents]]
==== Delete indices, not documents

Deleted documents aren't immediately removed from {es}'s file system.
Instead, {es} marks the document as deleted on each related shard. The marked
document will continue to use resources until it's removed during a periodic
<<index-modules-merge,segment merge>>.

When possible, delete entire indices instead. {es} can immediately remove
deleted indices directly from the file system and free up resources.

[discrete]
[[use-ds-ilm-for-time-series]]
==== Use data streams and {ilm-init} for time series data

<<data-streams,Data streams>> let you store time series data across multiple,
time-based backing indices. You can use <<index-lifecycle-management,{ilm}
({ilm-init})>> to automatically manage these backing indices.

One advantage of this setup is
<<getting-started-index-lifecycle-management,automatic rollover>>, which creates
a new write index when the current one meets a defined `max_primary_shard_size`,
`max_age`, `max_docs`, or `max_size` threshold. When an index is no longer
needed, you can use {ilm-init} to automatically delete it and free up resources.
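
For example, a minimal {ilm-init} policy along the following lines rolls over
the write index once it reaches a certain size or age and deletes backing
indices after a retention period. The policy name and thresholds are
illustrative only; pick values that match your own retention requirements.

[source,console]
----
PUT _ilm/policy/my-timeseries-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----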

{ilm-init} also makes it easy to change your sharding strategy over time:

* *Want to decrease the shard count for new indices?* +
Change the <<index-number-of-shards,`index.number_of_shards`>> setting in the
data stream's <<data-streams-change-mappings-and-settings,matching index
template>>.

* *Want larger shards or fewer backing indices?* +
Increase your {ilm-init} policy's <<ilm-rollover,rollover threshold>>.

* *Need indices that span shorter intervals?* +
Offset the increased shard count by deleting older indices sooner. You can do
this by lowering the `min_age` threshold for your policy's
<<ilm-index-lifecycle,delete phase>>.

Every new backing index is an opportunity to further tune your strategy.
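
For example, to lower the shard count for future backing indices (the first
point above), you would set `index.number_of_shards` in the data stream's
matching index template. The template below is a hypothetical, minimal sketch;
in practice you would resubmit your full template, because this request
replaces it entirely.

[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": ["my-data-stream*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.number_of_shards": 1
    }
  }
}
----

The new shard count only applies to backing indices created after the change;
existing backing indices keep their current shards until they age out or you
reindex them.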

[discrete]
[[shard-size-recommendation]]
==== Aim for shard sizes between 10GB and 50GB

Larger shards take longer to recover after a failure. When a node fails, {es}
rebalances the node's shards across the data tier's remaining nodes. This
recovery process typically involves copying the shard contents across the
network, so a 100GB shard will take twice as long to recover as a 50GB shard.
In contrast, small shards carry proportionally more overhead and are less
efficient to search. Searching fifty 1GB shards will take substantially more
resources than searching a single 50GB shard containing the same data.

There are no hard limits on shard size, but experience shows that shards
between 10GB and 50GB typically work well for logs and time series data. You
may be able to use larger shards depending on your network and use case.
Smaller shards may be appropriate for
{enterprise-search-ref}/index.html[Enterprise Search] and similar use cases.

If you use {ilm-init}, set the <<ilm-rollover,rollover action>>'s
`max_primary_shard_size` threshold to `50gb` to avoid shards larger than 50GB.

To see the current size of your shards, use the <<cat-shards,cat shards API>>.

[source,console]
----
GET _cat/shards?v=true&h=index,prirep,shard,store&s=prirep,store&bytes=gb
----
// TEST[setup:my_index]

The `store` value shows the size of each shard, and `prirep` indicates whether
the shard is a primary (`p`) or replica (`r`).

[source,txt]
----
index                                prirep shard store
.ds-my-data-stream-2099.05.06-000001 p      0     50gb
...
----
// TESTRESPONSE[non_json]
// TESTRESPONSE[s/\.ds-my-data-stream-2099\.05\.06-000001/my-index-000001/]
// TESTRESPONSE[s/50gb/.*/]

[discrete]
[[shard-count-recommendation]]
==== Aim for 20 shards or fewer per GB of heap memory

The number of shards a data node can hold is proportional to the node's heap
memory. For example, a node with 30GB of heap memory should have at most 600
shards. The further below this limit you can keep your nodes, the better. If
you find your nodes exceeding 20 shards per GB, consider adding another node.

Some system indices for {enterprise-search-ref}/index.html[Enterprise Search]
are nearly empty and rarely used. Due to their low overhead, you shouldn't
count shards for these indices toward a node's shard limit.

To check the configured size of each node's heap, use the <<cat-nodes,cat nodes
API>>.

[source,console]
----
GET _cat/nodes?v=true&h=heap.max
----
// TEST[setup:my_index]

You can use the <<cat-shards,cat shards API>> to check the number of shards per
node.

[source,console]
----
GET _cat/shards?v=true
----
// TEST[setup:my_index]

[discrete]
[[avoid-node-hotspots]]
==== Avoid node hotspots

If too many shards are allocated to a specific node, the node can become a
hotspot. For example, if a single node contains too many shards for an index
with a high indexing volume, the node is likely to have issues.

To prevent hotspots, use the
<<total-shards-per-node,`index.routing.allocation.total_shards_per_node`>> index
setting to explicitly limit the number of shards on a single node. You can
configure `index.routing.allocation.total_shards_per_node` using the
<<indices-update-settings,update index settings API>>.

[source,console]
--------------------------------------------------
PUT my-index-000001/_settings
{
  "index" : {
    "routing.allocation.total_shards_per_node" : 5
  }
}
--------------------------------------------------
// TEST[setup:my_index]

[discrete]
[[avoid-unnecessary-fields]]
==== Avoid unnecessary mapped fields

By default {es} <<dynamic-mapping,automatically creates a mapping>> for every
field in every document it indexes. Every mapped field corresponds to some data
structures on disk which are needed for efficient search, retrieval, and
aggregations on this field. Details about each mapped field are also held in
memory. In many cases this overhead is unnecessary because a field is not used
in any searches or aggregations. Use <<explicit-mapping>> instead of dynamic
mapping to avoid creating fields that are never used. If a collection of fields
is typically used together, consider using <<copy-to>> to consolidate them at
index time. If a field is only rarely used, it may be better to make it a
<<runtime,Runtime field>> instead.
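
As an illustration, the following hypothetical index disables dynamic mapping
and maps only the fields that are actually searched, consolidating two of them
into a single `copy_to` target. The index and field names are placeholders.

[source,console]
----
PUT my-index-000001
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "@timestamp": { "type": "date" },
      "message":    { "type": "text" },
      "hostname":   { "type": "keyword", "copy_to": "search_all" },
      "username":   { "type": "keyword", "copy_to": "search_all" },
      "search_all": { "type": "text" }
    }
  }
}
----

With `"dynamic": false`, unmapped fields are still stored in `_source` but are
not indexed, so they add no per-field overhead. Use `"dynamic": "strict"`
instead if you prefer to reject documents containing unexpected fields.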

You can get information about which fields are being used with the
<<field-usage-stats>> API, and you can analyze the disk usage of mapped fields
using the <<indices-disk-usage>> API. Note, however, that unnecessary mapped
fields also carry some memory overhead in addition to their disk usage.
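
For example, the following requests report field usage statistics and per-field
disk usage for a hypothetical `my-index-000001` index. The disk usage analysis
is resource-intensive and has to be enabled explicitly with
`run_expensive_tasks=true`.

[source,console]
----
GET my-index-000001/_field_usage_stats

POST my-index-000001/_disk_usage?run_expensive_tasks=true
----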

[discrete]
[[reduce-cluster-shard-count]]
=== Reduce a cluster's shard count

If your cluster is already oversharded, you can use one or more of the following
methods to reduce its shard count.

[discrete]
[[create-indices-that-cover-longer-time-periods]]
==== Create indices that cover longer time periods

If you use {ilm-init} and your retention policy allows it, avoid using a
`max_age` threshold for the rollover action. Instead, use
`max_primary_shard_size` to avoid creating empty indices or many small shards.

If your retention policy requires a `max_age` threshold, increase it to create
indices that cover longer time intervals. For example, instead of creating daily
indices, you can create indices on a weekly or monthly basis.

[discrete]
[[delete-empty-indices]]
==== Delete empty or unneeded indices

If you're using {ilm-init} and roll over indices based on a `max_age` threshold,
you can inadvertently create indices with no documents. These empty indices
provide no benefit but still consume resources.

You can find these empty indices using the <<cat-count,cat count API>>.

[source,console]
----
GET _cat/count/my-index-000001?v=true
----
// TEST[setup:my_index]

Once you have a list of empty indices, you can delete them using the
<<indices-delete-index,delete index API>>. You can also delete any other
unneeded indices.

[source,console]
----
DELETE my-index-000001
----
// TEST[setup:my_index]

[discrete]
[[force-merge-during-off-peak-hours]]
==== Force merge during off-peak hours

If you no longer write to an index, you can use the <<indices-forcemerge,force
merge API>> to <<index-modules-merge,merge>> smaller segments into larger ones.
This can reduce shard overhead and improve search speeds. However, force merges
are resource-intensive. If possible, run the force merge during off-peak hours.

[source,console]
----
POST my-index-000001/_forcemerge
----
// TEST[setup:my_index]

[discrete]
[[shrink-existing-index-to-fewer-shards]]
==== Shrink an existing index to fewer shards

If you no longer write to an index, you can use the
<<indices-shrink-index,shrink index API>> to reduce its shard count.
{ilm-init} also has a <<ilm-shrink,shrink action>> for indices in the
warm phase.
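
A minimal sketch of a manual shrink, assuming a hypothetical `my-index-000001`
with several primary shards: the index is first made read-only, then shrunk
into a new single-shard index named `my-shrunken-index-000001`.

[source,console]
----
PUT my-index-000001/_settings
{
  "index.blocks.write": true
}

POST my-index-000001/_shrink/my-shrunken-index-000001
{
  "settings": {
    "index.number_of_shards": 1,
    "index.blocks.write": null
  }
}
----

Before the shrink can run, {es} also needs a copy of every shard of the source
index on the same node and a green index health status; see the shrink index
API documentation for the full preparation steps.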

[discrete]
[[combine-smaller-indices]]
==== Combine smaller indices

You can also use the <<docs-reindex,reindex API>> to combine indices
with similar mappings into a single large index. For time series data, you could
reindex indices for short time periods into a new index covering a
longer period. For example, you could reindex daily indices from October with a
shared index pattern, such as `my-index-2099.10.11`, into a monthly
`my-index-2099.10` index. After the reindex, delete the smaller indices.

[source,console]
----
POST _reindex
{
  "source": {
    "index": "my-index-2099.10.*"
  },
  "dest": {
    "index": "my-index-2099.10"
  }
}
----

[discrete]
[[troubleshoot-shard-related-errors]]
=== Troubleshoot shard-related errors

Here's how to resolve common shard-related errors.

[discrete]
==== this action would add [x] total shards, but this cluster currently has [y]/[z] maximum shards open;

The <<cluster-max-shards-per-node,`cluster.max_shards_per_node`>> cluster
setting limits the maximum number of open shards for a cluster. This error
indicates an action would exceed this limit.

If you're confident your changes won't destabilize the cluster, you can
temporarily increase the limit using the <<cluster-update-settings,cluster
update settings API>> and retry the action.

[source,console]
----
PUT _cluster/settings
{
  "persistent" : {
    "cluster.max_shards_per_node": 1200
  }
}
----

This increase should only be temporary. As a long-term solution, we recommend
you add nodes to the oversharded data tier or
<<reduce-cluster-shard-count,reduce your cluster's shard count>>. To get a
cluster's current shard count after making changes, use the
<<cluster-stats,cluster stats API>>.

[source,console]
----
GET _cluster/stats?filter_path=indices.shards.total
----

When a long-term solution is in place, we recommend you reset the
`cluster.max_shards_per_node` limit.

[source,console]
----
PUT _cluster/settings
{
  "persistent" : {
    "cluster.max_shards_per_node": null
  }
}
----