|
@@ -3,11 +3,14 @@
|
|
|
|
|
|
experimental[]
|
|
|
|
|
|
-The field stats api allows one to find statistical properties of a field without executing a search, but
|
|
|
-looking up measurements that are natively available in the Lucene index. This can be useful to explore a dataset which
|
|
|
-you don't know much about. For example, this allows creating a histogram aggregation with meaningful intervals.
|
|
|
+The field stats api allows one to find statistical properties of a field
|
|
|
+without executing a search, by looking up measurements that are natively
|
|
|
+available in the Lucene index. This can be useful to explore a dataset which
|
|
|
+you don't know much about. For example, this allows creating a histogram
|
|
|
+aggregation with meaningful intervals based on the min/max range of values.
|
|
|
|
|
|
-The field stats api by defaults executes on all indices, but can execute on specific indices too.
|
|
|
+The field stats api by default executes on all indices, but can execute on
|
|
|
+specific indices too.
|
|
|
|
|
|
All indices:
|
|
|
|
|
@@ -26,15 +29,11 @@ curl -XGET "http://localhost:9200/index1,index2/_field_stats?fields=rating"
|
|
|
Supported request options:
|
|
|
|
|
|
[horizontal]
|
|
|
-`fields`::
|
|
|
-
|
|
|
-A list of fields to compute stats for.
|
|
|
-
|
|
|
-`level`::
|
|
|
-
|
|
|
-Defines if field stats should be returned on a per index level or on a cluster
|
|
|
-wide level. Valid values are `indices` and `cluster`. Defaults to `cluster`.
|
|
|
+`fields`:: A list of fields to compute stats for.
|
|
|
+`level`:: Defines if field stats should be returned on a per-index level or on a
|
|
|
+ cluster-wide level. Valid values are `indices` and `cluster` (default).
|
|
|
|
|
|
+[float]
|
|
|
=== Field statistics
|
|
|
|
|
|
The field stats api is supported on string based, number based and date based fields and can return the following statistics per field:
|
|
@@ -57,13 +56,13 @@ is a derived statistic and is based on the `max_doc` and `doc_count`.
|
|
|
`sum_doc_freq`::
|
|
|
|
|
|
The sum of each term's document frequency in this field, or -1 if this
|
|
|
-measurement isn't available on one or more shards. Document frequency is the
|
|
|
-number of documents containing a particular term.
|
|
|
+measurement isn't available on one or more shards.
|
|
|
+Document frequency is the number of documents containing a particular term.
|
|
|
|
|
|
`sum_total_term_freq`::
|
|
|
|
|
|
The sum of the term frequencies of all terms in this field across all
|
|
|
-documents, or `-1` if this measurement isn't available on one or more shards.
|
|
|
+documents, or -1 if this measurement isn't available on one or more shards.
|
|
|
Term frequency is the total number of occurrences of a term in a particular
|
|
|
document and field.
|
|
|
|
|
@@ -75,18 +74,19 @@ The lowest value in the field represented in a displayable form.
|
|
|
|
|
|
The highest value in the field represented in a displayable form.
|
|
|
|
|
|
-NOTE: For all the mentioned statistics, documents marked as deleted aren't taken into account. The documents marked
|
|
|
-as deleted are are only taken into account when the segments these documents reside on are merged away.
|
|
|
+NOTE: Documents marked as deleted (but not yet removed by the merge process)
|
|
|
+still affect all the mentioned statistics.
|
|
|
|
|
|
-[float]
|
|
|
-=== Example
|
|
|
+
|
|
|
+.Cluster level field statistics
|
|
|
+==================================================
|
|
|
|
|
|
[source,js]
|
|
|
--------------------------------------------------
|
|
|
-curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creation_date,display_name"
|
|
|
+GET /_field_stats?fields=rating,answer_count,creation_date,display_name
|
|
|
--------------------------------------------------
|
|
|
|
|
|
-[source,js]
|
|
|
+[source,json]
|
|
|
--------------------------------------------------
|
|
|
{
|
|
|
"_shards": {
|
|
@@ -140,12 +140,14 @@ curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creati
|
|
|
--------------------------------------------------
|
|
|
|
|
|
<1> The `_all` key indicates that it contains the field stats of all indices in the cluster.
|
|
|
+==================================================
|
|
|
|
|
|
-With level set to `indices`:
|
|
|
+.Indices level field statistics
|
|
|
+==================================================
|
|
|
|
|
|
[source,js]
|
|
|
--------------------------------------------------
|
|
|
-curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creation_date,display_name&level=indices"
|
|
|
+GET /_field_stats?fields=rating,answer_count,creation_date,display_name&level=indices
|
|
|
--------------------------------------------------
|
|
|
|
|
|
[source,js]
|
|
@@ -201,4 +203,6 @@ curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creati
|
|
|
}
|
|
|
--------------------------------------------------
|
|
|
|
|
|
-<1> The `stack` key means it contains all field stats for the `stack` index.
|
|
|
+<1> The `stack` key means it contains all field stats for the `stack` index.
|
|
|
+
|
|
|
+==================================================
|
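The introduction in this change says field stats can help pick meaningful histogram intervals from the min/max range of values. A minimal sketch of that idea follows. It is not part of the field stats api: the function name `histogram_interval` and the `target_buckets` parameter are hypothetical. It assumes the lowest and highest values of a numeric field have already been read from a field stats response.

```python
import math


def histogram_interval(min_value, max_value, target_buckets=20):
    """Pick a "round" interval (1, 2 or 5 times a power of ten) so that
    the range [min_value, max_value] needs at most target_buckets buckets.

    Hypothetical helper: min_value and max_value are assumed to come from
    the lowest and highest values reported by field stats.
    """
    if target_buckets <= 0:
        raise ValueError("target_buckets must be positive")

    span = max_value - min_value
    if span <= 0:
        # Single-valued (or empty) range: any interval works, fall back to 1.
        return 1

    # Exact bucket width, then round it up to the next "nice" step.
    raw = span / target_buckets
    magnitude = 10 ** math.floor(math.log10(raw))
    for step in (1, 2, 5, 10):
        if step * magnitude >= raw:
            return step * magnitude
    return 10 * magnitude
```

The returned value could then be supplied as the `interval` of a histogram aggregation on the same field. Rounding up to 1, 2 or 5 times a power of ten is only one possible choice, made here to keep bucket boundaries readable.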