["appendix",role="exclude",id="ml-rare-functions"]
= Rare functions

The rare functions detect values that occur rarely in time or rarely for a
population.

The `rare` analysis detects anomalies according to the number of distinct rare
values. This differs from `freq_rare`, which detects anomalies according to the
number of times (frequency) rare values occur.

[NOTE]
====
* The `rare` and `freq_rare` functions should not be used in conjunction with
`exclude_frequent`.
* You cannot create forecasts for {anomaly-jobs} that contain `rare` or
`freq_rare` functions.
* You cannot add rules with conditions to detectors that use `rare` or
`freq_rare` functions.
* Shorter bucket spans (less than 1 hour, for example) are recommended when
looking for rare events. The functions model whether something happens in a
bucket at least once. With longer bucket spans, it is more likely that
entities will be seen in a bucket and therefore they appear less rare.
The ideal bucket span depends on the characteristics of the data; short
bucket spans are typically measured in minutes, not hours.
* To model rare data, a learning period of at least 20 buckets is required
for typical data.
====

The {ml-features} include the following rare functions:

* <<ml-rare,`rare`>>
* <<ml-freq-rare,`freq_rare`>>

[discrete]
[[ml-rare]]
== Rare

The `rare` function detects values that occur rarely in time or rarely for a
population. It detects anomalies according to the number of distinct rare
values.

This function supports the following properties:

* `by_field_name` (required)
* `over_field_name` (optional)
* `partition_field_name` (optional)

For more information about those properties, see the
{ref}/ml-put-job.html#ml-put-job-request-body[create {anomaly-jobs} API].

.Example 1: Analyzing status codes with the rare function
[source,js]
--------------------------------------------------
{
  "function" : "rare",
  "by_field_name" : "status"
}
--------------------------------------------------
// NOTCONSOLE

If you use this `rare` function in a detector in your {anomaly-job}, it detects
values that are rare in time. It models status codes that occur over time and
detects when rare status codes occur compared to the past. For example, you can
detect status codes in a web access log that have never (or rarely) occurred
before.

.Example 2: Analyzing status codes in a population with the rare function
[source,js]
--------------------------------------------------
{
  "function" : "rare",
  "by_field_name" : "status",
  "over_field_name" : "clientip"
}
--------------------------------------------------
// NOTCONSOLE

If you use this `rare` function in a detector in your {anomaly-job}, it detects
values that are rare in a population. It models status code and client IP
interactions that occur. It defines a rare status code as one that occurs for
few client IP values compared to the population. It detects client IP values
that experience one or more distinct rare status codes compared to the
population. For example, in a web access log, a `clientip` that experiences the
highest number of different rare status codes compared to the population is
regarded as highly anomalous. This analysis is based on the number of different
status code values, not the count of occurrences.

NOTE: To define a status code as rare, the {ml-features} look at the number
of distinct status codes that occur, not the number of times the status code
occurs. If a single client IP experiences a single unique status code, this
is rare, even if it occurs for that client IP in every bucket.

[discrete]
[[ml-freq-rare]]
== Freq_rare

The `freq_rare` function detects values that occur rarely for a population.
It detects anomalies according to the number of times (frequency) that rare
values occur.

This function supports the following properties:

* `by_field_name` (required)
* `over_field_name` (required)
* `partition_field_name` (optional)

For more information about those properties, see the
{ref}/ml-put-job.html#ml-put-job-request-body[create {anomaly-jobs} API].

.Example 3: Analyzing URI values in a population with the freq_rare function
[source,js]
--------------------------------------------------
{
  "function" : "freq_rare",
  "by_field_name" : "uri",
  "over_field_name" : "clientip"
}
--------------------------------------------------
// NOTCONSOLE

If you use this `freq_rare` function in a detector in your {anomaly-job}, it
detects values that are frequently rare in a population. It models URI paths and
client IP interactions that occur. It defines a rare URI path as one that is
visited by few client IP values compared to the population. It detects the
client IP values that experience many interactions with rare URI paths compared
to the population. For example, in a web access log, a client IP that visits
one or more rare URI paths many times compared to the population is regarded as
highly anomalous. This analysis is based on the count of interactions with rare
URI paths, not the number of different URI path values.

NOTE: A URI path is defined as rare in the same way as the status codes above:
the analytics consider the number of distinct values that occur, not the number
of times the URI path occurs. If a single client IP visits a single unique URI
path, this is rare, even if it occurs for that client IP in every bucket.
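
The detector examples above show only the detector object. As a rough
illustration of how the bucket span recommendation fits together with a `rare`
detector, the following sketch shows a possible job request body. It is a
minimal sketch under stated assumptions, not a recommended configuration: the
`timestamp` time field is a hypothetical name, the `status` and `clientip`
fields are reused from the examples above, and the 15-minute bucket span is
only one illustrative value below the 1-hour guideline.

.Example 4: A job configuration sketch that uses the rare function
[source,js]
--------------------------------------------------
{
  "description" : "Sketch only: field names and bucket span are illustrative assumptions",
  "analysis_config" : {
    "bucket_span" : "15m",
    "detectors" : [
      {
        "function" : "rare",
        "by_field_name" : "status",
        "over_field_name" : "clientip"
      }
    ]
  },
  "data_description" : {
    "time_field" : "timestamp"
  }
}
--------------------------------------------------
// NOTCONSOLE

With a 15-minute bucket span, the learning period of at least 20 buckets
corresponds to at least about 5 hours of data (20 x 15 minutes) before rare
values can be modeled for typical data.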