Document and test date_range "missing" support (#28983)

* Add a REST integration test that documents date_range support

Add a test case that exercises date_range aggregations using the missing
option.

Addresses #17597

* Test cleanup and correction

Add a document with a null date to exercise the `missing` option, and update the
test name to something reasonable.

* Update documentation to explain how the "missing" parameter works for
date_range aggregations.

* Wrap lines at 80 chars in docs.

* Change format of test to YAML for readability.
Paul Sanwald 7 years ago
parent
commit
6dae955b6a

+ 76 - 17
docs/reference/aggregations/bucket/daterange-aggregation.asciidoc

@@ -1,8 +1,14 @@
 [[search-aggregations-bucket-daterange-aggregation]]
 === Date Range Aggregation
 
-A range aggregation that is dedicated for date values. The main difference between this aggregation and the normal <<search-aggregations-bucket-range-aggregation,range>> aggregation is that the `from` and `to` values can be expressed in <<date-math,Date Math>> expressions, and it is also possible to specify a date format by which the `from` and `to` response fields will be returned.
-Note that this aggregation includes the `from` value and excludes the `to` value for each range.
+A range aggregation that is dedicated for date values. The main difference
+between this aggregation and the normal
+<<search-aggregations-bucket-range-aggregation,range>>
+aggregation is that the `from` and `to` values can be expressed in
+<<date-math,Date Math>> expressions, and it is also possible to specify a date
+format by which the `from` and `to` response fields will be returned.
+Note that this aggregation includes the `from` value and excludes the `to` value
+for each range.
 
 Example:
 
@@ -30,8 +36,9 @@ POST /sales/_search?size=0
 <1> < now minus 10 months, rounded down to the start of the month.
 <2> >= now minus 10 months, rounded down to the start of the month.
 
-In the example above, we created two range buckets, the first will "bucket" all documents dated prior to 10 months ago and
-the second will "bucket" all documents dated since 10 months ago
+In the example above, we created two range buckets: the first will "bucket" all
+documents dated prior to 10 months ago, and the second will "bucket" all
+documents dated since 10 months ago.
 
 Response:
 
@@ -61,12 +68,52 @@ Response:
 --------------------------------------------------
 // TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
 
+==== Missing Values
+
+The `missing` parameter defines how documents that are missing a value should
+be treated. By default they will be ignored, but it is also possible to treat
+them as if they had a value. This is done by setting `missing` to the default
+value that such documents should use.
+
+[source,js]
+--------------------------------------------------
+POST /sales/_search?size=0
+{
+   "aggs": {
+       "range": {
+           "date_range": {
+               "field": "date",
+               "missing": "1976/11/30",
+               "ranges": [
+                  { 
+                    "key": "Older",
+                    "to": "2016/02/01" 
+                  }, <1>
+                  { 
+                    "key": "Newer",
+                    "from": "2016/02/01", 
+                    "to" : "now/d" 
+                  }
+              ]
+          }
+      }
+   }
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[setup:sales]
+
+<1> Documents without a value in the `date` field will be added to the "Older"
+bucket, as if they had a date value of "1976/11/30".
+
 [[date-format-pattern]]
 ==== Date Format/Pattern
 
-NOTE: this information was copied from http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[JodaDate]
+NOTE: this information was copied from
+http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[JodaDate]
 
-All ASCII letters are reserved as format pattern letters, which are defined as follows:
+All ASCII letters are reserved as format pattern letters, which are defined
+as follows:
 
 [options="header"]
 |=======
@@ -104,30 +151,41 @@ All ASCII letters are reserved as format pattern letters, which are defined as f
 
 The count of pattern letters determine the format.
 
-Text:: If the number of pattern letters is 4 or more, the full form is used; otherwise a short or abbreviated form is used if available.
+Text:: If the number of pattern letters is 4 or more, the full form is used;
+otherwise a short or abbreviated form is used if available.
 
-Number:: The minimum number of digits. Shorter numbers are zero-padded to this amount.
+Number:: The minimum number of digits. Shorter numbers are zero-padded to
+this amount.
 
-Year:: Numeric presentation for year and weekyear fields are handled specially. For example, if the count of 'y' is 2, the year will be displayed as the zero-based year of the century, which is two digits.
+Year:: Numeric presentation for year and weekyear fields are handled
+specially. For example, if the count of 'y' is 2, the year will be displayed
+as the zero-based year of the century, which is two digits.
 
 Month:: 3 or over, use text, otherwise use number.
 
-Zone:: 'Z' outputs offset without a colon, 'ZZ' outputs the offset with a colon, 'ZZZ' or more outputs the zone id.
+Zone:: 'Z' outputs offset without a colon, 'ZZ' outputs the offset with a
+colon, 'ZZZ' or more outputs the zone id.
 
 Zone names:: Time zone names ('z') cannot be parsed.
 
-Any characters in the pattern that are not in the ranges of ['a'..'z'] and ['A'..'Z'] will be treated as quoted text. For instance, characters like ':', '.', ' ', '#' and '?' will appear in the resulting time text even they are not embraced within single quotes.
+Any characters in the pattern that are not in the ranges of ['a'..'z'] and
+['A'..'Z'] will be treated as quoted text. For instance, characters like ':',
+'.', ' ', '#' and '?' will appear in the resulting time text even if they are
+not enclosed within single quotes.
 
 [[time-zones]]
 ==== Time zone in date range aggregations
 
-Dates can be converted from another time zone to UTC by specifying the `time_zone` parameter.
+Dates can be converted from another time zone to UTC by specifying the
+`time_zone` parameter.
 
-Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or -08:00) or as one of
-the http://www.joda.org/joda-time/timezones.html[time zone ids] from the TZ database.
+Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or
+-08:00) or as one of the http://www.joda.org/joda-time/timezones.html[time
+zone ids] from the TZ database.
 
-The `time_zone` parameter is also applied to rounding in date math expressions. As an example,
-to round to the beginning of the day in the CET time zone, you can do the following:
+The `time_zone` parameter is also applied to rounding in date math expressions.
+As an example, to round to the beginning of the day in the CET time zone, you
+can do the following:
 
 [source,js]
 --------------------------------------------------
@@ -156,7 +214,8 @@ POST /sales/_search?size=0
 
 ==== Keyed Response
 
-Setting the `keyed` flag to `true` will associate a unique string key with each bucket and return the ranges as a hash rather than an array:
+Setting the `keyed` flag to `true` will associate a unique string key with each
+bucket and return the ranges as a hash rather than an array:
 
 [source,js]
 --------------------------------------------------

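The docs diff above states that `time_zone` also affects rounding in date math, for example rounding to the beginning of the day in CET. The following is a minimal Python sketch of that idea only, not Elasticsearch code. It assumes CET can be modelled as a fixed +01:00 offset (ignoring daylight saving), and the `round_down_to_day` helper and the sample instant are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CET is modelled as a fixed +01:00 offset (no daylight saving).
CET = timezone(timedelta(hours=1))


def round_down_to_day(instant, tz):
    """Return the start of the day containing `instant` in `tz`, as a UTC time."""
    local = instant.astimezone(tz)
    start_of_day = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_day.astimezone(timezone.utc)


# Hypothetical "now": half past midnight UTC on 2016-02-01.
now = datetime(2016, 2, 1, 0, 30, tzinfo=timezone.utc)

utc_rounded = round_down_to_day(now, timezone.utc)
cet_rounded = round_down_to_day(now, CET)

print(utc_rounded.isoformat())
print(cet_rounded.isoformat())
```

The two results differ by one hour because the local day in CET starts an hour before the UTC day, which is why the boundary of a range such as `now/d` moves when a time zone is supplied.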
+ 74 - 0
rest-api-spec/src/main/resources/rest-api-spec/test/search.aggregation/40_range.yml

@@ -273,3 +273,77 @@ setup:
   - match: { aggregations.date_range.buckets.1.from: 3000000 }
   - match: { aggregations.date_range.buckets.1.to: 4000000 }
 
+---
+"Date Range Missing":
+  - do:
+      index:
+        index: test
+        type: test
+        id: 1
+        body: { "date" : "28800000000" }
+
+  - do:
+      index:
+        index: test
+        type: test
+        id: 2
+        body: { "date" : "315561600000" }
+
+  - do:
+      index:
+        index: test
+        type: test
+        id: 3
+        body: { "date" : "631180800000" }
+
+  - do:
+      index:
+        index: test
+        type: test
+        id: 4
+        body: { "date" : "-2524492800000" }
+
+  - do:
+      index:
+        index: test
+        type: test
+        id: 5
+        body: { "ip" : "192.168.0.1" }
+
+  - do:
+      indices.refresh: {}
+
+  - do:
+      search:
+        body:
+          aggs:
+            age_groups:
+              date_range:
+                field: date
+                missing: "-2240496000000"
+                ranges:
+                - key: Generation Y
+                  from: '315561600000'
+                  to: '946713600000'
+                - key: Generation X
+                  from: "-157737600000"
+                  to: '315561600000'
+                - key: Other
+                  to: "-2208960000000"
+                  
+  - match: { hits.total: 5 }
+
+  - length: { aggregations.age_groups.buckets: 3 }
+
+  - match: { aggregations.age_groups.buckets.0.key: "Other" }
+
+  - match: { aggregations.age_groups.buckets.0.doc_count: 2 }
+
+  - match: { aggregations.age_groups.buckets.1.key: "Generation X" }
+
+  - match: { aggregations.age_groups.buckets.1.doc_count: 1 }
+
+  - match: { aggregations.age_groups.buckets.2.key: "Generation Y" }
+
+  - match: { aggregations.age_groups.buckets.2.doc_count: 2 }
+
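
The expected counts in the "Date Range Missing" test can be checked by hand with a small Python sketch. This is an illustration of the bucketing rule described in the docs (`from` inclusive, `to` exclusive, `missing` substituted for absent values), not Elasticsearch code. The `count_buckets` helper is hypothetical, and the dates are written as integers here although the test indexes them as strings.

```python
MISSING = -2240496000000  # default applied to documents without a `date`

# (key, from, to) in epoch milliseconds; None means the bound is open.
RANGES = [
    ("Generation Y", 315561600000, 946713600000),
    ("Generation X", -157737600000, 315561600000),
    ("Other", None, -2208960000000),
]

DOCS = [
    {"date": 28800000000},
    {"date": 315561600000},
    {"date": 631180800000},
    {"date": -2524492800000},
    {"ip": "192.168.0.1"},  # no `date` field, so MISSING is used instead
]


def count_buckets(docs, ranges, missing):
    """Count documents per range, treating `from` as inclusive and `to` as exclusive."""
    counts = {key: 0 for key, _, _ in ranges}
    for doc in docs:
        value = doc.get("date", missing)
        for key, lower, upper in ranges:
            above = lower is None or value >= lower
            below = upper is None or value < upper
            if above and below:
                counts[key] += 1
    return counts


counts = count_buckets(DOCS, RANGES, MISSING)
print(counts)
```

Document 2 sits exactly on the shared boundary `315561600000`, so it falls into "Generation Y" only, and document 5 lands in "Other" because its substituted value is below that range's `to` bound.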