
[DOCS] Update routing formulas (#76203)

The `_routing` metadata field docs currently include formulas for how
Elasticsearch routes documents to shards. However, these formulas were not
updated for #18699.  This updates the routing formulas and adds xrefs for
related settings.

Closes #76072
James Rodewig 4 years ago
commit 32a516807a

+ 3 - 1
docs/reference/index-modules.asciidoc

@@ -47,8 +47,10 @@ NOTE: The number of shards are limited to `1024` per index. This limitation is a
 `index.number_of_routing_shards`::
 +
 ====
-Number of routing shards used to <<indices-split-index,split>> an index.
+Integer value used with <<index-number-of-shards,`index.number_of_shards`>> to
+route documents to a primary shard. See <<mapping-routing-field>>.
 
+{es} uses this value when <<indices-split-index,splitting>> an index.
 For example, a 5 shard index with `number_of_routing_shards` set to `30` (`5 x
 2 x 3`) could be split by a factor of `2` or `3`. In other words, it could be
 split as follows:

+ 12 - 6
docs/reference/mapping/fields/routing-field.asciidoc

@@ -2,12 +2,17 @@
 === `_routing` field
 
 A document is routed to a particular shard in an index using the following
-formula:
+formulas:
+    
+    routing_factor = num_routing_shards / num_primary_shards
+    shard_num = (hash(_routing) % num_routing_shards) / routing_factor
 
-    shard_num = hash(_routing) % num_primary_shards
-
-The default value used for `_routing` is the document's <<mapping-id-field,`_id`>>.
+`num_routing_shards` is the value of the
+<<index-number-of-routing-shards,`index.number_of_routing_shards`>> index
+setting. `num_primary_shards` is the value of the
+<<index-number-of-shards,`index.number_of_shards`>> index setting.
 
+The default `_routing` value is the document's <<mapping-id-field,`_id`>>.
 Custom routing patterns can be implemented by specifying a custom `routing`
 value per document. For instance:
 
@@ -118,9 +123,10 @@ This is done by providing the index level setting <<routing-partition-size,`inde
 As the partition size increases, the more evenly distributed the data will become at the
 expense of having to search more shards per request.
 
-When this setting is present, the formula for calculating the shard becomes:
+When this setting is present, the formulas for calculating the shard become:
 
-    shard_num = (hash(_routing) + hash(_id) % routing_partition_size) % num_primary_shards
+    routing_value = hash(_routing) + hash(_id) % routing_partition_size
+    shard_num = (routing_value % num_routing_shards) / routing_factor
 
 That is, the `_routing` field is used to calculate a set of shards within the index and then the
 `_id` is used to pick a shard within that set.
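
The updated formulas in this diff can be checked with a small sketch. The Python below is only an illustration, not Elasticsearch's implementation: `stand_in_hash` is a hypothetical placeholder based on CRC32 (the diff does not say which hash function is used), and the function and parameter names are assumptions chosen to mirror the symbols in the formulas. It also assumes `num_routing_shards` is a multiple of `num_primary_shards`, so that `routing_factor` is a whole number.

```python
import zlib
from typing import Optional


def stand_in_hash(value: str) -> int:
    """Hypothetical stand-in for hash(); NOT the hash Elasticsearch uses."""
    return zlib.crc32(value.encode("utf-8"))


def shard_num(
    routing: str,
    num_primary_shards: int,
    num_routing_shards: int,
    doc_id: Optional[str] = None,
    routing_partition_size: int = 1,
) -> int:
    """Apply the routing formulas from the diff to pick a primary shard."""
    # Assumption: the routing shard count divides evenly by the primary count.
    if num_routing_shards % num_primary_shards != 0:
        raise ValueError("num_routing_shards must be a multiple of num_primary_shards")

    # routing_factor = num_routing_shards / num_primary_shards
    routing_factor = num_routing_shards // num_primary_shards

    # Default case: routing_value = hash(_routing)
    routing_value = stand_in_hash(routing)

    # Partitioned case:
    # routing_value = hash(_routing) + hash(_id) % routing_partition_size
    if doc_id is not None:
        routing_value += stand_in_hash(doc_id) % routing_partition_size

    # shard_num = (routing_value % num_routing_shards) / routing_factor
    return (routing_value % num_routing_shards) // routing_factor


if __name__ == "__main__":
    # A 5-shard index with 30 routing shards, as in the index-modules example.
    print(shard_num("user1", num_primary_shards=5, num_routing_shards=30))
    # The same routing value with a partition size of 3 and varying _id values.
    print({
        shard_num("user1", 5, 30, doc_id=str(i), routing_partition_size=3)
        for i in range(50)
    })
```

When `num_routing_shards` equals `num_primary_shards`, `routing_factor` is `1` and the result reduces to the removed formula, `hash(_routing) % num_primary_shards`. With a partition size of `n`, a single `_routing` value yields at most `n` distinct `routing_value` results, so its documents land on at most `n` shards.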