@@ -45,20 +45,21 @@ PUT _ml/anomaly_detectors/farequote
     "bucket_span": "60m",
     "detectors": [{
       "function": "mean",
-      "field_name": "responsetime",
-      "by_field_name": "airline"
+      "field_name": "responsetime", <1>
+      "by_field_name": "airline" <1>
     }],
     "summary_count_field_name": "doc_count"
   },
   "data_description": {
-    "time_field":"time"
+    "time_field":"time" <1>
   }
 }
 ----------------------------------
 // TEST[skip:setup:farequote_data]

-In this example, the `airline`, `responsetime`, and `time` fields are
-aggregations.
+<1> In this example, the `airline`, `responsetime`, and `time` fields are
+aggregations. Only the aggregated fields defined in the `analysis_config` object
+are analyzed by the {anomaly-job}.

 NOTE: When the `summary_count_field_name` property is set to a non-null value,
 the job expects to receive aggregated input. The property must be set to the
@@ -81,16 +82,16 @@ PUT _ml/datafeeds/datafeed-farequote
       "time_zone": "UTC"
     },
     "aggregations": {
-      "time": {
+      "time": { <1>
         "max": {"field": "time"}
       },
-      "airline": {
+      "airline": { <1>
         "terms": {
           "field": "airline",
           "size": 100
         },
         "aggregations": {
-          "responsetime": {
+          "responsetime": { <1>
             "avg": {
               "field": "responsetime"
             }
@@ -104,18 +105,13 @@ PUT _ml/datafeeds/datafeed-farequote
 ----------------------------------
 // TEST[skip:setup:farequote_job]

-In this example, the aggregations have names that match the fields that they
+<1> In this example, the aggregations have names that match the fields that they
 operate on. That is to say, the `max` aggregation is named `time` and its
 field is also `time`. The same is true for the aggregations with the names
-`airline` and `responsetime`. Since you must create the job before you can
-create the {dfeed}, synchronizing your aggregation and field names can simplify
-these configuration steps.
+`airline` and `responsetime`.

-IMPORTANT: If you use a `max` aggregation on a time field, the aggregation name
-in the {dfeed} must match the name of the time field, as in the previous example.
-For all other aggregations, if the aggregation name doesn't match the field name,
-there are limitations in the drill-down functionality within the {ml} page in
-{kib}.
+IMPORTANT: Your {dfeed} can contain multiple aggregations, but only the ones
+with names that match values in the job configuration are fed to the job.
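To illustrate that behavior with a hypothetical fragment (the `p95_responsetime` name and the `percentiles` aggregation are placeholders, not part of the example above), a {dfeed} could define both of the following aggregations. If the job configuration references only `responsetime`, only the `avg` values reach the job; the percentile results are ignored:

----------------------------------
"aggregations": {
  "responsetime": {
    "avg": {"field": "responsetime"}
  },
  "p95_responsetime": {
    "percentiles": {"field": "responsetime", "percents": [95]}
  }
}
----------------------------------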

 {dfeeds-cap} support complex nested aggregations. This example uses the `derivative`
 pipeline aggregation to find the first order derivative of the counter
@@ -243,8 +239,8 @@ When you define an aggregation in a {dfeed}, it must have the following form:
 The top level aggregation must be either a
 {ref}/search-aggregations-bucket.html[bucket aggregation] containing a single
 sub-aggregation that is a `date_histogram`, or the top level aggregation is the
-required `date_histogram`. There must be exactly 1 `date_histogram` aggregation.
-For more information, see
+required `date_histogram`. There must be exactly one `date_histogram`
+aggregation. For more information, see
 {ref}/search-aggregations-bucket-datehistogram-aggregation.html[Date histogram aggregation].
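As a minimal sketch of that required shape (illustrative only: the aggregation name `buckets`, the field name `time`, and the interval value are placeholders), the simplest valid form puts the `date_histogram` at the top level with a `max` sub-aggregation on the time field:

----------------------------------
"aggregations": {
  "buckets": {
    "date_histogram": {
      "field": "time",
      "fixed_interval": "360s",
      "time_zone": "UTC"
    },
    "aggregations": {
      "time": {
        "max": {"field": "time"}
      }
    }
  }
}
----------------------------------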

 NOTE: The `time_zone` parameter in the date histogram aggregation must be set to