[role="xpack"]
[[ml-configuring-detector-custom-rules]]
=== Customizing detectors with custom rules

<<ml-rules,Custom rules>> enable you to change the behavior of anomaly
detectors based on domain-specific knowledge.

Custom rules describe _when_ a detector should take a certain _action_ instead
of following its default behavior. To specify the _when_, a rule uses a
`scope` and `conditions`. You can think of `scope` as the categorical
specification of a rule, while `conditions` are the numerical part. A rule can
have a scope, one or more conditions, or a combination of scope and conditions.

Let us see how these can be configured through examples.

==== Specifying custom rule scope

Let us assume we are configuring an {anomaly-job} in order to detect DNS data
exfiltration. Our data contain the fields `subdomain` and
`highest_registered_domain`. We can use a detector that looks like
`high_info_content(subdomain) over highest_registered_domain`. If we run such a
job, it is possible that we discover a lot of anomalies on frequently used
domains that we have reasons to trust. As security analysts, we are not
interested in such anomalies. Ideally, we could instruct the detector to skip
results for domains that we consider safe. Using a rule with a scope allows us
to achieve this.

First, we need to create a list of our safe domains. Such lists are called
_filters_ in {ml}. Filters can be shared across {anomaly-jobs}.

We create our filter using the {ref}/ml-put-filter.html[put filter API]:

[source,js]
----------------------------------
PUT _ml/filters/safe_domains
{
  "description": "Our list of safe domains",
  "items": ["safe.com", "trusted.com"]
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

Now, we can create our {anomaly-job} specifying a scope that uses the
`safe_domains` filter for the `highest_registered_domain` field:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/dns_exfiltration_with_rule
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "high_info_content",
      "field_name": "subdomain",
      "over_field_name": "highest_registered_domain",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "highest_registered_domain": {
            "filter_id": "safe_domains",
            "filter_type": "include"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

As time advances and we see more data and more results, we might encounter new
domains that we want to add to the filter. We can do that by using the
{ref}/ml-update-filter.html[update filter API]:

[source,js]
----------------------------------
POST _ml/filters/safe_domains/_update
{
  "add_items": ["another-safe.com"]
}
----------------------------------
// CONSOLE
// TEST[skip:setup:ml_filter_safe_domains]

Note that we can use any of the `partition_field_name`, `over_field_name`, or
`by_field_name` fields in the `scope`.

In the following example we scope multiple fields:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/scoping_multiple_fields
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "count",
      "partition_field_name": "my_partition",
      "over_field_name": "my_over",
      "by_field_name": "my_by",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "my_partition": {
            "filter_id": "filter_1"
          },
          "my_over": {
            "filter_id": "filter_2"
          },
          "my_by": {
            "filter_id": "filter_3"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

Such a detector will skip results when the values of all three scoped fields
are included in the referenced filters.
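
Conversely, setting `filter_type` to `exclude` makes the rule apply when the
field value is _not_ in the referenced filter. As a sketch (the job name and
the `monitored_partitions` filter here are hypothetical), the following
detector keeps results only for partitions on a monitored list and skips
everything else:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/exclude_scope_sketch
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "count",
      "partition_field_name": "my_partition",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "my_partition": {
            "filter_id": "monitored_partitions",
            "filter_type": "exclude"
          }
        }
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]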

==== Specifying custom rule conditions

Imagine a detector that looks for anomalies in CPU utilization.
Given a machine that is idle for long enough, small movements in CPU
utilization could result in anomalous results where the `actual` value is
quite small, for example, 0.02. Given our knowledge about how CPU utilization
behaves, we might determine that anomalies with such small actual values are
not interesting for investigation.

Let us now configure an {anomaly-job} with a rule that will skip results where
CPU utilization is less than 0.20:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/cpu_with_rule
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "high_mean",
      "field_name": "cpu_utilization",
      "custom_rules": [{
        "actions": ["skip_result"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 0.20
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]

When there are multiple conditions, they are combined with a logical `and`.
This is useful when we want the rule to apply to a range: we simply create
a rule with two conditions, one for each end of the desired range.

Here is an example where a count detector will skip results when the count
is greater than 30 and less than 50:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/rule_with_range
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "count",
      "custom_rules": [{
        "actions": ["skip_result"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "gt",
            "value": 30
          },
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 50
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]
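
As noted earlier, a single rule can also combine `scope` and `conditions`, in
which case the rule applies only when both match. Here is a sketch (the job
name and the `test_machines` filter are hypothetical) that skips low CPU
utilization results, but only for machines in a given filter:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/combined_rule_sketch
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "high_mean",
      "field_name": "cpu_utilization",
      "partition_field_name": "machine",
      "custom_rules": [{
        "actions": ["skip_result"],
        "scope": {
          "machine": {
            "filter_id": "test_machines",
            "filter_type": "include"
          }
        },
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "lt",
            "value": 0.20
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]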

==== Custom rules in the life-cycle of a job

Custom rules only affect results created after the rules were applied.
Let us imagine that we have configured an {anomaly-job} and it has been running
for some time. After observing its results, we decide that we can employ
rules to get rid of some uninteresting results. We can use the
{ref}/ml-update-job.html[update {anomaly-job} API] to do so. However, the
rule we added will only be in effect for results created from the moment we
added the rule onwards. Past results remain unaffected.
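
For example, assuming an existing job named `cpu_no_rule` (hypothetical) with
the same `high_mean` detector as above but no rules, adding the low-CPU rule
to its first detector might look like the following sketch:

[source,js]
----------------------------------
POST _ml/anomaly_detectors/cpu_no_rule/_update
{
  "detectors": [{
    "detector_index": 0,
    "custom_rules": [{
      "actions": ["skip_result"],
      "conditions": [
        {
          "applies_to": "actual",
          "operator": "lt",
          "value": 0.20
        }
      ]
    }]
  }]
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]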

==== Using custom rules vs. filtering data

It might appear that using rules is just another way of filtering the data
that feeds into an {anomaly-job}. For example, a rule that skips results when
the partition field value is in a filter sounds equivalent to having a query
that filters out such documents. However, there is a fundamental difference.
When the data is filtered before reaching a job, it is as if the data never
existed for the job. With rules, the data still reaches the job and affects
its behavior (depending on the rule actions).

For example, a rule with the `skip_result` action means all data will still
be modeled. On the other hand, a rule with the `skip_model_update` action
means results will still be created even though the model will not be updated
by data matched by the rule.
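
For instance, if we want unusually large spikes to keep being reported as
anomalies without being absorbed into the model's baseline, we can attach the
`skip_model_update` action instead. The following is a sketch (the job name
and the 0.95 threshold are hypothetical):

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/skip_model_update_sketch
{
  "analysis_config" : {
    "bucket_span": "5m",
    "detectors" : [{
      "function": "high_mean",
      "field_name": "cpu_utilization",
      "custom_rules": [{
        "actions": ["skip_model_update"],
        "conditions": [
          {
            "applies_to": "actual",
            "operator": "gt",
            "value": 0.95
          }
        ]
      }]
    }]
  },
  "data_description" : {
    "time_field": "timestamp"
  }
}
----------------------------------
// CONSOLE
// TEST[skip:needs-licence]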