
[role="xpack"]
[[ml-configuring-detector-custom-rules]]
=== Customizing detectors with custom rules

<<ml-rules,Custom rules>> enable you to change the behavior of anomaly
detectors based on domain-specific knowledge.

Custom rules describe _when_ a detector should take a certain _action_ instead
of following its default behavior. To specify the _when_, a rule uses a `scope`
and `conditions`. You can think of `scope` as the categorical specification of
a rule, while `conditions` are the numerical part. A rule can have a scope, one
or more conditions, or a combination of scope and conditions. Let us see how
these can be configured through examples.

==== Specifying custom rule scope

Let us assume we are configuring a job in order to detect DNS data
exfiltration. Our data contain the fields `subdomain` and
`highest_registered_domain`. We can use a detector that looks like
`high_info_content(subdomain) over highest_registered_domain`. If we run such
a job, it is possible that we discover a lot of anomalies on frequently used
domains that we have reasons to trust. As security analysts, we are not
interested in such anomalies. Ideally, we could instruct the detector to skip
results for domains that we consider safe. Using a rule with a scope allows
us to achieve this.

First, we need to create a list of our safe domains. Such lists are called
_filters_ in {ml}. Filters can be shared across jobs.

We create our filter using the {ref}/ml-put-filter.html[put filter API]:

[source,js]
----------------------------------
PUT _ml/filters/safe_domains
{
  "description": "Our list of safe domains",
  "items": ["safe.com", "trusted.com"]
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]
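
We can check the filter's contents at any time with the
{ref}/ml-get-filter.html[get filters API]:

[source,js]
----------------------------------
GET _ml/filters/safe_domains
----------------------------------
// CONSOLE
// TEST[skip:setup:ml_filter_safe_domains]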

Now, we can create our job specifying a scope that uses the `safe_domains`
filter for the `highest_registered_domain` field:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/dns_exfiltration_with_rule
{
  "analysis_config" : {
    "bucket_span":"5m",
    "detectors" :[{
      "function":"high_info_content",
      "field_name": "subdomain",
      "over_field_name": "highest_registered_domain",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "highest_registered_domain": {
            "filter_id": "safe_domains",
            "filter_type": "include"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field":"timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

As time advances and we see more data and more results, we might encounter new
domains that we want to add to the filter. We can do that by using the
{ref}/ml-update-filter.html[update filter API]:

[source,js]
----------------------------------
POST _ml/filters/safe_domains/_update
{
  "add_items": ["another-safe.com"]
}
----------------------------------
// CONSOLE
// TEST[skip:setup:ml_filter_safe_domains]
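
The same API also accepts a `remove_items` array, so if a domain later loses
our trust we can take it out of the filter again:

[source,js]
----------------------------------
POST _ml/filters/safe_domains/_update
{
  "remove_items": ["safe.com"]
}
----------------------------------
// CONSOLE
// TEST[skip:setup:ml_filter_safe_domains]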

Note that we can use any of the `partition_field_name`, `over_field_name`, or
`by_field_name` fields in the `scope`. In the following example we scope
multiple fields:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/scoping_multiple_fields
{
  "analysis_config" : {
    "bucket_span":"5m",
    "detectors" :[{
      "function":"count",
      "partition_field_name": "my_partition",
      "over_field_name": "my_over",
      "by_field_name": "my_by",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "my_partition": {
            "filter_id": "filter_1"
          },
          "my_over": {
            "filter_id": "filter_2"
          },
          "my_by": {
            "filter_id": "filter_3"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field":"timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

Such a detector skips results when the values of all three scoped fields are
included in the referenced filters.

==== Specifying custom rule conditions

Imagine a detector that looks for anomalies in CPU utilization. Given a
machine that is idle for long enough, small movements in CPU utilization could
result in anomalous results where the `actual` value is quite small, for
example, 0.02. Given our knowledge about how CPU utilization behaves, we might
determine that anomalies with such small actual values are not interesting for
investigation.

Let us now configure a job with a rule that skips results where CPU
utilization is less than 0.20:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/cpu_with_rule
{
  "analysis_config" : {
    "bucket_span":"5m",
    "detectors" :[{
      "function":"high_mean",
      "field_name": "cpu_utilization",
      "custom_rules": [{
        "actions": ["skip_result"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 0.20
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field":"timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

When there are multiple conditions, they are combined with a logical `and`.
This is useful when we want a rule to apply to a range: we simply create a
rule with two conditions, one for each end of the desired range. Here is an
example where a count detector skips results when the count is greater than
30 and less than 50:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/rule_with_range
{
  "analysis_config" : {
    "bucket_span":"5m",
    "detectors" :[{
      "function":"count",
      "custom_rules": [{
        "actions": ["skip_result"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "gt",
            "value": 30
          },
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 50
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field":"timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]
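
Scope and conditions can also be combined in a single rule, in which case the
rule applies only when both match. As an illustrative sketch (the job name,
the `host` field, and the `dev_hosts` filter below are placeholders, not part
of the examples above), the following detector skips results only for hosts
in the filter whose actual CPU utilization is below 0.20:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/rule_with_scope_and_conditions
{
  "analysis_config" : {
    "bucket_span":"5m",
    "detectors" :[{
      "function":"high_mean",
      "field_name": "cpu_utilization",
      "partition_field_name": "host",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "host": {
            "filter_id": "dev_hosts"
          }
        },
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 0.20
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field":"timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]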

==== Custom rules in the life-cycle of a job

Custom rules only affect results created after the rules were applied. Let us
imagine that we have configured a job and it has been running for some time.
After observing its results, we decide that we can employ rules to get rid of
some uninteresting results. We can use the
{ref}/ml-update-job.html[update job API] to do so. However, the rule we added
is only in effect for results created from the moment we added the rule
onwards. Past results remain unaffected.
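
For example, assuming we had created the `cpu_with_rule` job without its
rule, we could add the rule afterwards by referencing the detector via its
`detector_index` (a sketch reusing the illustrative 0.20 threshold and
assuming the job's only detector is at index 0):

[source,js]
----------------------------------
POST _ml/anomaly_detectors/cpu_with_rule/_update
{
  "detectors": [{
    "detector_index": 0,
    "custom_rules": [{
      "actions": ["skip_result"],
      "conditions": [
        {
          "applies_to": "actual",
          "operator": "lt",
          "value": 0.20
        }
      ]
    }]
  }]
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]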

==== Using custom rules vs. filtering data

It might appear like using rules is just another way of filtering the data
that feeds into a job. For example, a rule that skips results when the
partition field value is in a filter sounds equivalent to having a query that
filters out such documents. But it is not. There is a fundamental difference.
When the data is filtered before reaching a job, it is as if the data never
existed for the job. With rules, the data still reaches the job and affects
its behavior (depending on the rule actions).

For example, a rule with the `skip_result` action means all data is still
modeled. On the other hand, a rule with the `skip_model_update` action means
results are still created even though the model is not updated by data
matched by a rule.
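
To illustrate the latter, here is a sketch of a detector that keeps producing
results for entities matched by a filter while preventing their data from
influencing the model (the job name, the `service` field, and the
`noisy_services` filter are placeholders):

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/requests_with_model_guard
{
  "analysis_config" : {
    "bucket_span":"5m",
    "detectors" :[{
      "function":"count",
      "partition_field_name": "service",
      "custom_rules": [{
        "actions": ["skip_model_update"],
        "scope": {
          "service": {
            "filter_id": "noisy_services"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field":"timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]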