[[mapping-doc-count-field]]
=== `_doc_count` field

Bucket aggregations always return a field named `doc_count` that shows the number of documents aggregated and partitioned
into each bucket. Computing `doc_count` is simple: it is incremented by 1 for every document collected
in each bucket.

While this approach works when computing aggregations over individual documents, it fails to accurately represent
documents that store pre-aggregated data (such as `histogram` or `aggregate_metric_double` fields), because one summary field may
represent multiple documents.

To allow the number of documents to be computed correctly when working with pre-aggregated data, we have introduced a
metadata field type named `_doc_count`. `_doc_count` must always be a positive integer representing the number of documents
aggregated in a single summary field.

When a `_doc_count` field is added to a document, all bucket aggregations respect its value and increment the bucket's `doc_count`
by the value of the field. If a document does not contain a `_doc_count` field, `_doc_count = 1` is implied by default.

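The counting rule described above can be illustrated with a small Python sketch. This is only a model of the rule, not how aggregations are implemented; the function name `bucket_doc_counts`, the `status` field, and the sample documents are hypothetical.

```python
from collections import defaultdict


def bucket_doc_counts(docs, field):
    """Group documents by `field` and add each document's `_doc_count` to its bucket."""
    counts = defaultdict(int)
    for doc in docs:
        # A document without `_doc_count` is counted as exactly one document.
        counts[doc[field]] += doc.get("_doc_count", 1)
    return dict(counts)


# Hypothetical documents: two carry pre-aggregated counts, one does not.
docs = [
    {"status": "ok", "_doc_count": 10},
    {"status": "ok"},
    {"status": "error", "_doc_count": 3},
]

counts = bucket_doc_counts(docs, "status")
print(counts)  # {'ok': 11, 'error': 3}
```

The `ok` bucket reports 11 rather than 2 because the first document stands for 10 original documents and the second for 1.
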
[IMPORTANT]
========
* A `_doc_count` field can only store a single positive integer per document. Nested arrays are not allowed.
* If a document contains no `_doc_count` field, aggregators increment by 1, which is the default behavior.
========

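As a rough illustration of these constraints, the following Python sketch checks a candidate value on the client side before indexing. It follows only the rules stated above (a single positive integer, no arrays) and does not reproduce the server's actual parsing behavior; `validate_doc_count` is a hypothetical helper.

```python
def validate_doc_count(value):
    """Return `value` if it is a single positive integer, else raise ValueError."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"_doc_count must be a single integer, got {value!r}")
    if value <= 0:
        raise ValueError(f"_doc_count must be positive, got {value}")
    return value


print(validate_doc_count(45))  # 45

try:
    validate_doc_count([45, 62])  # arrays are not allowed
except ValueError as err:
    print(err)
```
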
[[mapping-doc-count-field-example]]
==== Example

The following <<indices-create-index, create index>> API request creates a new index with the following field mappings:

* `my_histogram`, a `histogram` field used to store percentile data
* `my_text`, a `keyword` field used to store a title for the histogram

[source,console]
--------------------------------------------------
PUT my_index
{
  "mappings" : {
    "properties" : {
      "my_histogram" : {
        "type" : "histogram"
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}
--------------------------------------------------

The following <<docs-index_,index>> API requests store pre-aggregated data for
two histograms: `histogram_1` and `histogram_2`.

[source,console]
--------------------------------------------------
PUT my_index/_doc/1
{
  "my_text" : "histogram_1",
  "my_histogram" : {
      "values" : [0.1, 0.2, 0.3, 0.4, 0.5],
      "counts" : [3, 7, 23, 12, 6]
   },
  "_doc_count": 45 <1>
}

PUT my_index/_doc/2
{
  "my_text" : "histogram_2",
  "my_histogram" : {
      "values" : [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
      "counts" : [8, 17, 8, 7, 6, 2]
   },
  "_doc_count": 62 <1>
}
--------------------------------------------------
<1> The `_doc_count` field must be a positive integer that stores the number of documents aggregated to produce each histogram.

If we run the following <<search-aggregations-bucket-terms-aggregation, terms aggregation>> on `my_index`:

[source,console]
--------------------------------------------------
GET /_search
{
  "aggs" : {
    "histogram_titles" : {
      "terms" : { "field" : "my_text" }
    }
  }
}
--------------------------------------------------

We will get the following response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations" : {
    "histogram_titles" : {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets" : [
        {
          "key" : "histogram_2",
          "doc_count" : 62
        },
        {
          "key" : "histogram_1",
          "doc_count" : 45
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:test not setup]

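To see how the two indexed documents lead to the bucket counts in this response, here is a minimal Python sketch that applies the `_doc_count` rule to the example data. It is a model of the result only: the helper `terms_buckets` is hypothetical, and the descending order by `doc_count` is an assumption chosen to match the bucket order shown in the response.

```python
from collections import Counter


def terms_buckets(docs, field):
    """Build terms-style buckets, weighting each document by its `_doc_count`."""
    totals = Counter()
    for doc in docs:
        # Documents without `_doc_count` contribute 1.
        totals[doc[field]] += doc.get("_doc_count", 1)
    # most_common() returns (key, count) pairs sorted by count, largest first.
    return [{"key": key, "doc_count": n} for key, n in totals.most_common()]


# The two documents from the example, reduced to the fields that matter here.
docs = [
    {"my_text": "histogram_1", "_doc_count": 45},
    {"my_text": "histogram_2", "_doc_count": 62},
]

buckets = terms_buckets(docs, "my_text")
print(buckets)
# [{'key': 'histogram_2', 'doc_count': 62}, {'key': 'histogram_1', 'doc_count': 45}]
```

Without the `_doc_count` values, each bucket would report a `doc_count` of 1, since each key matches exactly one indexed document.
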