[[size-your-shards]]
== How to size your shards
++++
<titleabbrev>Size your shards</titleabbrev>
++++

To protect against hardware failure and increase capacity, {es} stores copies of
an index's data across multiple shards on multiple nodes. The number and size of
these shards can have a significant impact on your cluster's health. One common
problem is _oversharding_, a situation in which a cluster with a large number of
shards becomes unstable.

[discrete]
[[create-a-sharding-strategy]]
=== Create a sharding strategy

The best way to prevent oversharding and other shard-related issues
is to create a sharding strategy. A sharding strategy helps you determine and
maintain the optimal number of shards for your cluster while limiting the size
of those shards.

Unfortunately, there is no one-size-fits-all sharding strategy. A strategy that
works in one environment may not scale in another. A good sharding strategy must
account for your infrastructure, use case, and performance expectations.

The best way to create a sharding strategy is to benchmark your production data
on production hardware using the same queries and indexing loads you'd see in
production. For our recommended methodology, watch the
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing[quantitative
cluster sizing video]. As you test different shard configurations, use {kib}'s
{kibana-ref}/elasticsearch-metrics.html[{es} monitoring tools] to track your
cluster's stability and performance.

The following sections provide some reminders and guidelines you should consider
when designing your sharding strategy. If your cluster has shard-related
problems, see <<fix-an-oversharded-cluster>>.

[discrete]
[[shard-sizing-considerations]]
=== Sizing considerations

Keep the following things in mind when building your sharding strategy.

[discrete]
[[single-thread-per-shard]]
==== Searches run on a single thread per shard

Most searches hit multiple shards. Each shard runs the search on a single
CPU thread. While a shard can run multiple concurrent searches, searches across a
large number of shards can deplete a node's <<modules-threadpool,search
thread pool>>. This can result in low throughput and slow search speeds.
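
If you suspect a depleted search thread pool, you can check each node's pool for
queued and rejected searches using the cat thread pool API. This is a minimal
check; it assumes nothing beyond a running cluster:

[source,console]
----
GET /_cat/thread_pool/search?v=true&h=node_name,name,active,queue,rejected
----

A persistently high `queue` or nonzero `rejected` count suggests searches are
fanning out across more shards than the node's search threads can service.
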
[discrete]
[[each-shard-has-overhead]]
==== Each shard has overhead

Every shard uses memory and CPU resources. In most cases, a small
set of large shards uses fewer resources than many small shards.

Segments play a big role in a shard's resource usage. Most shards contain
several segments, which store its index data. {es} keeps segment metadata in
JVM heap memory so it can be quickly retrieved for searches. As a
shard grows, its segments are <<index-modules-merge,merged>> into fewer, larger
segments. This decreases the number of segments, which means less metadata is
kept in heap memory.
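
To see how many segments each shard holds and how much memory they use, you can
list them with the cat segments API. The index name is this page's running
example:

[source,console]
----
GET /_cat/segments/my-index-000001?v=true&h=index,shard,segment,size,size.memory
----

The `size.memory` column reports the memory used by each segment.
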
[discrete]
[[shard-auto-balance]]
==== {es} automatically balances shards within a data tier

A cluster's nodes are grouped into <<data-tiers,data tiers>>. Within each tier,
{es} attempts to spread an index's shards across as many nodes as possible. When
you add a new node or a node fails, {es} automatically rebalances the index's
shards across the tier's remaining nodes.
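
To see how shards are currently spread across your nodes, you can use the cat
allocation API:

[source,console]
----
GET /_cat/allocation?v=true
----

Each row reports a node's shard count and disk usage, which makes an uneven
distribution easy to spot.
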
[discrete]
[[shard-size-best-practices]]
=== Best practices

Where applicable, use the following best practices as starting points for your
sharding strategy.

[discrete]
[[delete-indices-not-documents]]
==== Delete indices, not documents

Deleted documents aren't immediately removed from {es}'s file system.
Instead, {es} marks the document as deleted on each related shard. The marked
document will continue to use resources until it's removed during a periodic
<<index-modules-merge,segment merge>>.

When possible, delete entire indices instead. {es} can immediately remove
deleted indices directly from the file system and free up resources.
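
For example, if you store each day's data in its own index, you can drop an
entire day in a single request rather than deleting its documents individually.
The daily index name below is hypothetical:

[source,console]
----
DELETE /my-index-2099.10.11
----
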
[discrete]
[[use-ds-ilm-for-time-series]]
==== Use data streams and {ilm-init} for time series data

<<data-streams,Data streams>> let you store time series data across multiple,
time-based backing indices. You can use <<index-lifecycle-management,{ilm}
({ilm-init})>> to automatically manage these backing indices.

[role="screenshot"]
image::images/ilm/index-lifecycle-policies.png[]

One advantage of this setup is
<<getting-started-index-lifecycle-management,automatic rollover>>, which creates
a new write index when the current one meets a defined `max_age`, `max_docs`, or
`max_size` threshold. You can use these thresholds to create indices based on
your retention intervals. When an index is no longer needed, you can use
{ilm-init} to automatically delete it and free up resources.
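
As a minimal sketch, the following hypothetical {ilm-init} policy rolls over to
a new backing index at 50GB or 30 days, whichever comes first, and deletes an
index 90 days after rollover. The policy name and threshold values are
illustrative, not recommendations:

[source,console]
----
// Hypothetical policy; thresholds are illustrative
PUT /_ilm/policy/my-timeseries-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----
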
{ilm-init} also makes it easy to change your sharding strategy over time:

* *Want to decrease the shard count for new indices?* +
Change the <<index-number-of-shards,`index.number_of_shards`>> setting in the
data stream's <<data-streams-change-mappings-and-settings,matching index
template>>, as shown in the sketch after this list.

* *Want larger shards?* +
Increase your {ilm-init} policy's <<ilm-rollover,rollover threshold>>.

* *Need indices that span shorter intervals?* +
Offset the increased shard count by deleting older indices sooner. You can do
this by lowering the `min_age` threshold for your policy's
<<ilm-index-lifecycle,delete phase>>.
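
For the first option, the change might look like the following hypothetical
index template, which gives a matching data stream's new backing indices a
single primary shard. The template name and index pattern are assumptions:

[source,console]
----
// Hypothetical template and pattern names
PUT /_index_template/my-data-stream-template
{
  "index_patterns": ["my-data-stream*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.number_of_shards": 1
    }
  }
}
----
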
Every new backing index is an opportunity to further tune your strategy.

[discrete]
[[shard-size-recommendation]]
==== Aim for shard sizes between 10GB and 50GB

Shards larger than 50GB may make a cluster less likely to recover from failure.
When a node fails, {es} rebalances the node's shards across the data tier's
remaining nodes. Shards larger than 50GB can be harder to move across a network
and may tax node resources.
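
To find shards approaching the upper end of this range, you can list shards
sorted by store size:

[source,console]
----
GET /_cat/shards?v=true&h=index,shard,prirep,store&s=store:desc
----
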
[discrete]
[[shard-count-recommendation]]
==== Aim for 20 shards or fewer per GB of heap memory

The number of shards a node can hold is proportional to the node's
heap memory. For example, a node with 30GB of heap memory should
have at most 600 shards. The further below this limit you can keep your nodes,
the better. If you find your nodes exceeding 20 shards per GB, consider adding
another node.

To check the current size of each node's heap, use the <<cat-nodes,cat nodes
API>>.

[source,console]
----
GET _cat/nodes?v=true&h=heap.current
----
// TEST[setup:my_index]

You can use the <<cat-shards,cat shards API>> to check the number of shards per
node.

[source,console]
----
GET _cat/shards
----
// TEST[setup:my_index]

[discrete]
[[avoid-node-hotspots]]
==== Avoid node hotspots

If too many shards are allocated to a specific node, the node can become a
hotspot. For example, if a single node contains too many shards for an index
with a high indexing volume, the node is likely to become a performance
bottleneck.

To prevent hotspots, use the
<<total-shards-per-node,`index.routing.allocation.total_shards_per_node`>> index
setting to explicitly limit the number of shards on a single node. You can
configure `index.routing.allocation.total_shards_per_node` using the
<<indices-update-settings,update index settings API>>.

[source,console]
--------------------------------------------------
PUT /my-index-000001/_settings
{
  "index" : {
    "routing.allocation.total_shards_per_node" : 5
  }
}
--------------------------------------------------
// TEST[setup:my_index]

[discrete]
[[fix-an-oversharded-cluster]]
=== Fix an oversharded cluster

If your cluster is experiencing stability issues due to oversharded indices,
you can use one or more of the following methods to fix them.

[discrete]
[[reindex-indices-from-shorter-periods-into-longer-periods]]
==== Create time-based indices that cover longer periods

For time series data, you can create indices that cover longer time intervals.
For example, instead of daily indices, you can create indices on a monthly or
yearly basis.

If you're using {ilm-init}, you can do this by increasing the `max_age`
threshold for the <<ilm-rollover,rollover action>>.

If your retention policy allows it, you can also create larger indices by
omitting a `max_age` threshold and using `max_docs` and/or `max_size`
thresholds instead.
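
As a sketch, this hypothetical policy omits `max_age` entirely and rolls over on
size or document count alone, so each index spans as long a period as your
retention allows. The policy name and values are assumptions:

[source,console]
----
// Hypothetical policy; values are illustrative
PUT /_ilm/policy/my-longer-interval-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_docs": 100000000
          }
        }
      }
    }
  }
}
----
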
[discrete]
[[delete-empty-indices]]
==== Delete empty or unneeded indices

If you're using {ilm-init} and roll over indices based on a `max_age` threshold,
you can inadvertently create indices with no documents. These empty indices
provide no benefit but still consume resources.

You can find these empty indices using the <<cat-count,cat count API>>.

[source,console]
----
GET /_cat/count/my-index-000001?v=true
----
// TEST[setup:my_index]

Once you have a list of empty indices, you can delete them using the
<<indices-delete-index,delete index API>>. You can also delete any other
unneeded indices.

[source,console]
----
DELETE /my-index-*
----
// TEST[setup:my_index]

[discrete]
[[force-merge-during-off-peak-hours]]
==== Force merge during off-peak hours

If you no longer write to an index, you can use the <<indices-forcemerge,force
merge API>> to <<index-modules-merge,merge>> smaller segments into larger ones.
This can reduce shard overhead and improve search speeds. However, force merges
are resource-intensive. If possible, run the force merge during off-peak hours.

[source,console]
----
POST /my-index-000001/_forcemerge
----
// TEST[setup:my_index]

[discrete]
[[shrink-existing-index-to-fewer-shards]]
==== Shrink an existing index to fewer shards

If you no longer write to an index, you can use the
<<indices-shrink-index,shrink index API>> to reduce its shard count.

[source,console]
----
POST /my-index-000001/_shrink/my-shrunken-index-000001
----
// TEST[s/^/PUT my-index-000001\n{"settings":{"index.number_of_shards":2,"blocks.write":true}}\n/]

{ilm-init} also has a <<ilm-shrink,shrink action>> for indices in the
warm phase.
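
For example, a hypothetical policy's warm phase might shrink each index to a
single shard 30 days after rollover. The policy name, age, and shard count are
illustrative:

[source,console]
----
// Hypothetical policy; values are illustrative
PUT /_ilm/policy/my-shrink-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "30d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          }
        }
      }
    }
  }
}
----
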
[discrete]
[[combine-smaller-indices]]
==== Combine smaller indices

You can also use the <<docs-reindex,reindex API>> to combine indices
with similar mappings into a single large index. For time series data, you could
reindex indices for short time periods into a new index covering a
longer period. For example, you could reindex daily indices from October, such
as `my-index-2099.10.11`, that match the shared index pattern
`my-index-2099.10.*` into a monthly `my-index-2099.10` index. After the
reindex, delete the smaller indices.

[source,console]
----
POST /_reindex
{
  "source": {
    "index": "my-index-2099.10.*"
  },
  "dest": {
    "index": "my-index-2099.10"
  }
}
----
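
After the reindex completes, you might verify the monthly index's document count
before removing the daily source indices. This follow-up assumes the example
names above:

[source,console]
----
GET /_cat/count/my-index-2099.10?v=true

DELETE /my-index-2099.10.*
----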
|