4 years ago · 2b9ed7d36f
--- a/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc
+++ b/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc
@@ -486,12 +486,11 @@ The above aggregation would only return tags which have been found in 10 hits or
 
 Terms that score highly will be collected on a shard level and merged with the terms collected from other shards in a second step. However, the shard does not have the information about the global term frequencies available. The decision if a term is added to a candidate list depends only on the score computed on the shard using local shard frequencies, not the global frequencies of the word. The `min_doc_count` criterion is only applied after merging local terms statistics of all shards. In a way the decision to add the term as a candidate is made without being very _certain_ about if the term will actually reach the required `min_doc_count`. This might cause many (globally) high frequent terms to be missing in the final result if low frequent but high scoring terms populated the candidate lists. To avoid this, the `shard_size` parameter can be increased to allow more candidate terms on the shards. However, this increases memory consumption and network traffic.
 
-`shard_min_doc_count` parameter
-
-The parameter `shard_min_doc_count` regulates the _certainty_ a shard has if the term should actually be added to the candidate list or not with respect to the `min_doc_count`. Terms will only be considered if their local shard frequency within the set is higher than the `shard_min_doc_count`. If your dictionary contains many low frequent words and you are not interested in these (for example misspellings), then you can set the `shard_min_doc_count` parameter to filter out candidate terms on a shard level that will with a reasonable certainty not reach the required `min_doc_count` even after merging the local frequencies. `shard_min_doc_count` is set to `1` per default and has no effect unless you explicitly set it.
-
 
+[[search-aggregations-bucket-significantterms-shard-min-doc-count]]
+===== `shard_min_doc_count`
 
+include::terms-aggregation.asciidoc[tag=min-doc-count]
 
 WARNING: Setting `min_doc_count` to `1` is generally not advised as it tends to return terms that
          are typos or other bizarre curiosities. Finding more than one instance of a term helps
--- a/docs/reference/aggregations/bucket/significanttext-aggregation.asciidoc
+++ b/docs/reference/aggregations/bucket/significanttext-aggregation.asciidoc
@@ -393,17 +393,10 @@ This might cause many (globally) high frequent terms to be missing in the final
 the candidate lists. To avoid this, the `shard_size` parameter can be increased to allow more candidate terms on the shards. 
 However, this increases memory consumption and network traffic.
 
-`shard_min_doc_count` parameter
-
-The parameter `shard_min_doc_count` regulates the _certainty_ a shard has if the term should actually be added to the candidate list or 
-not with respect to the `min_doc_count`. Terms will only be considered if their local shard frequency within the set is higher than the 
-`shard_min_doc_count`. If your dictionary contains many low frequent words and you are not interested in these (for example misspellings), 
-then you can set the `shard_min_doc_count` parameter to filter out candidate terms on a shard level that will with a reasonable certainty 
-not reach the required `min_doc_count` even after merging the local frequencies. `shard_min_doc_count` is set to `1` per default and has 
-no effect unless you explicitly set it.
-
-
+[[search-aggregations-bucket-significanttext-shard-min-doc-count]]
+====== `shard_min_doc_count`
 
+include::terms-aggregation.asciidoc[tag=min-doc-count]
 
 WARNING: Setting `min_doc_count` to `1` is generally not advised as it tends to return terms that
          are typos or other bizarre curiosities. Finding more than one instance of a term helps
--- a/docs/reference/aggregations/bucket/terms-aggregation.asciidoc
+++ b/docs/reference/aggregations/bucket/terms-aggregation.asciidoc
@@ -386,10 +386,12 @@ The above aggregation would only return tags which have been found in 10 hits or
 
 Terms are collected and ordered on a shard level and merged with the terms collected from other shards in a second step. However, the shard does not have the information about the global document count available. The decision if a term is added to a candidate list depends only on the order computed on the shard using local shard frequencies. The `min_doc_count` criterion is only applied after merging local terms statistics of all shards. In a way the decision to add the term as a candidate is made without being very _certain_ about if the term will actually reach the required `min_doc_count`. This might cause many (globally) high frequent terms to be missing in the final result if low frequent terms populated the candidate lists. To avoid this, the `shard_size` parameter can be increased to allow more candidate terms on the shards. However, this increases memory consumption and network traffic.
 
-`shard_min_doc_count` parameter
+[[search-aggregations-bucket-terms-shard-min-doc-count]]
+===== `shard_min_doc_count`
 
+// tag::min-doc-count[]
 The parameter `shard_min_doc_count` regulates the _certainty_ a shard has if the term should actually be added to the candidate list or not with respect to the `min_doc_count`. Terms will only be considered if their local shard frequency within the set is higher than the `shard_min_doc_count`. If your dictionary contains many low frequent terms and you are not interested in those (for example misspellings), then you can set the `shard_min_doc_count` parameter to filter out candidate terms on a shard level that will with a reasonable certainty not reach the required `min_doc_count` even after merging the local counts. `shard_min_doc_count` is set to `0` per default and has no effect unless you explicitly set it.
-
+// end::min-doc-count[]
 
 
 NOTE:    Setting `min_doc_count`=`0` will also return buckets for terms that didn't match any hit. However, some of