[role="xpack"]
[[ml-count-functions]]
=== Count functions

Count functions detect anomalies when the number of events in a bucket is
anomalous.

Use `non_zero_count` functions if your data is sparse and you want to ignore
cases where the bucket count is zero.

Use `distinct_count` functions to determine when the number of distinct values
in one field is unusual, as opposed to the total count.

Use high-sided functions if you want to monitor unusually high event rates.
Use low-sided functions if you want to look at drops in event rate.

The {xpackml} features include the following count functions:

* xref:ml-count[`count`, `high_count`, `low_count`]
* xref:ml-nonzero-count[`non_zero_count`, `high_non_zero_count`, `low_non_zero_count`]
* xref:ml-distinct-count[`distinct_count`, `high_distinct_count`, `low_distinct_count`]

[float]
[[ml-count]]
===== Count, high_count, low_count

The `count` function detects anomalies when the number of events in a bucket is
anomalous.

The `high_count` function detects anomalies when the count of events in a
bucket is unusually high.

The `low_count` function detects anomalies when the count of events in a
bucket is unusually low.

These functions support the following properties:

* `by_field_name` (optional)
* `over_field_name` (optional)
* `partition_field_name` (optional)

For more information about those properties,
see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].

.Example 1: Analyzing events with the count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example1
{
  "analysis_config": {
    "detectors": [{
      "function" : "count"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

This example is probably the simplest possible analysis. It identifies
time buckets during which the overall count of events is higher or lower than
usual.

When you use this function in a detector in your job, it models the event rate
and detects when the event rate is unusual compared to its past behavior.

.Example 2: Analyzing errors with the high_count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example2
{
  "analysis_config": {
    "detectors": [{
      "function" : "high_count",
      "by_field_name" : "error_code",
      "over_field_name": "user"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

If you use this `high_count` function in a detector in your job, it
models the event rate for each error code. It detects users that generate an
unusually high count of error codes compared to other users.

.Example 3: Analyzing status codes with the low_count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example3
{
  "analysis_config": {
    "detectors": [{
      "function" : "low_count",
      "by_field_name" : "status_code"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

In this example, the function detects when the count of events for a
status code is lower than usual.

When you use this function in a detector in your job, it models the event rate
for each status code and detects when a status code has an unusually low count
compared to its past behavior.

.Example 4: Analyzing aggregated data with the count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example4
{
  "analysis_config": {
    "summary_count_field_name" : "events_per_min",
    "detectors": [{
      "function" : "count"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

If you are analyzing an aggregated `events_per_min` field, do not use a sum
function (for example, `sum(events_per_min)`). Instead, use the count function
and the `summary_count_field_name` property. For more information, see
<<ml-configuring-aggregation>>.

[float]
[[ml-nonzero-count]]
===== Non_zero_count, high_non_zero_count, low_non_zero_count

The `non_zero_count` function detects anomalies when the number of events in a
bucket is anomalous, but it ignores cases where the bucket count is zero. Use
this function if you know your data is sparse or has gaps and the gaps are not
important.

The `high_non_zero_count` function detects anomalies when the number of events
in a bucket is unusually high, and it ignores cases where the bucket count is
zero.

The `low_non_zero_count` function detects anomalies when the number of events in
a bucket is unusually low, and it ignores cases where the bucket count is zero.

These functions support the following properties:

* `by_field_name` (optional)
* `partition_field_name` (optional)

For more information about those properties,
see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].

For example, if you have the following number of events per bucket:

========================================

1,22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,43,31,0,0,0,0,0,0,0,0,0,0,0,0,2,1

========================================

The `non_zero_count` function models only the following data:

========================================

1,22,2,43,31,2,1

========================================

.Example 5: Analyzing signatures with the high_non_zero_count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example5
{
  "analysis_config": {
    "detectors": [{
      "function" : "high_non_zero_count",
      "by_field_name" : "signaturename"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

If you use this `high_non_zero_count` function in a detector in your job, it
models the count of events for the `signaturename` field. It ignores any buckets
where the count is zero and detects when a `signaturename` value has an
unusually high count of events compared to its past behavior.

NOTE: Population analysis (using an `over_field_name` property value) is not
supported for the `non_zero_count`, `high_non_zero_count`, and
`low_non_zero_count` functions. If you want to do population analysis and your
data is sparse, use the `count` functions, which are optimized for that scenario.

[float]
[[ml-distinct-count]]
===== Distinct_count, high_distinct_count, low_distinct_count

The `distinct_count` function detects anomalies where the number of distinct
values in one field is unusual.

The `high_distinct_count` function detects unusually high numbers of distinct
values in one field.

The `low_distinct_count` function detects unusually low numbers of distinct
values in one field.

These functions support the following properties:

* `field_name` (required)
* `by_field_name` (optional)
* `over_field_name` (optional)
* `partition_field_name` (optional)

For more information about those properties,
see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].

.Example 6: Analyzing users with the distinct_count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example6
{
  "analysis_config": {
    "detectors": [{
      "function" : "distinct_count",
      "field_name" : "user"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

This `distinct_count` function detects when a system has an unusual number
of logged-in users. When you use this function in a detector in your job, it
models the distinct count of users. It also detects when the distinct number of
users is unusual compared to the past.

.Example 7: Analyzing ports with the high_distinct_count function
[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/example7
{
  "analysis_config": {
    "detectors": [{
      "function" : "high_distinct_count",
      "field_name" : "dst_port",
      "over_field_name": "src_ip"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:needs-licence]

This example detects instances of port scanning. When you use this function in a
detector in your job, it models the distinct count of ports. It also detects the
`src_ip` values that connect to an unusually high number of different
`dst_port` values compared to other `src_ip` values.
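
To make the difference between these function families more concrete, the following Python sketch computes the per-bucket values that a `non_zero_count` detector and a `distinct_count` detector with an `over_field_name` would look at. It is an illustration only and does not show how the {xpackml} features model data internally. The helper names (`non_zero_counts`, `distinct_count_per_bucket`), the event layout, and the one-minute bucket span are assumptions made for this example.

```python
from collections import defaultdict


def non_zero_counts(counts):
    """Keep only the buckets that contain at least one event."""
    return [count for count in counts if count != 0]


def distinct_count_per_bucket(events, bucket_span_ms, field, over_field):
    """Map (bucket index, over_field value) to the number of distinct
    `field` values seen in that bucket. Hypothetical helper."""
    seen = defaultdict(set)
    for event in events:
        bucket = event["timestamp"] // bucket_span_ms
        seen[(bucket, event[over_field])].add(event[field])
    return {key: len(values) for key, values in seen.items()}


# Bucket counts from the non_zero_count example in this section.
counts = [1, 22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          2, 43, 31, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1]
modeled = non_zero_counts(counts)

# Made-up events: one source touches three ports, another repeats one port.
events = [
    {"timestamp": 1000, "src_ip": "10.0.0.1", "dst_port": 22},
    {"timestamp": 2000, "src_ip": "10.0.0.1", "dst_port": 80},
    {"timestamp": 3000, "src_ip": "10.0.0.1", "dst_port": 443},
    {"timestamp": 4000, "src_ip": "10.0.0.2", "dst_port": 80},
    {"timestamp": 5000, "src_ip": "10.0.0.2", "dst_port": 80},
]
ports_per_source = distinct_count_per_bucket(
    events, bucket_span_ms=60_000, field="dst_port", over_field="src_ip"
)
```

In this sketch, `modeled` contains only the non-zero bucket counts, and `ports_per_source` holds one distinct port count per `src_ip` and bucket. A detector such as the one in Example 7 would then compare those per-source values against each other to find unusually high ones.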