
[role="xpack"]
[[ml-get-category]]
= Get categories API
++++
<titleabbrev>Get categories</titleabbrev>
++++

Retrieves {anomaly-job} results for one or more categories.

[[ml-get-category-request]]
== {api-request-title}

`GET _ml/anomaly_detectors/<job_id>/results/categories` +
`GET _ml/anomaly_detectors/<job_id>/results/categories/<category_id>`
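
The two request forms differ only in the optional trailing category ID. As a minimal sketch, the Python helper below assembles either path. The name `categories_path` is hypothetical and is not part of any client library; the helper only builds the string shown in the request lines above.

```python
from typing import Optional
from urllib.parse import quote


def categories_path(job_id: str, category_id: Optional[int] = None) -> str:
    """Build the request path for the get categories API (illustrative helper)."""
    # Percent-encode the job ID so it is safe to place in a URL path segment.
    path = f"_ml/anomaly_detectors/{quote(job_id, safe='')}/results/categories"
    # Append the category ID only when a single category is requested.
    if category_id is not None:
        path += f"/{category_id}"
    return path


print(categories_path("esxi_log"))
print(categories_path("esxi_log", 1))
```

Omitting `category_id` yields the first request form, and passing it yields the second.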

[[ml-get-category-prereqs]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.

[[ml-get-category-desc]]
== {api-description-title}

When `categorization_field_name` is specified in the job configuration, it is
possible to view the definitions of the resulting categories. A category
definition describes the common terms matched and contains examples of matched
values.

The anomaly results from a categorization analysis are available as bucket,
influencer, and record results. For example, the results might indicate that
at 16:45 there was an unusual count of log message category 11. You can then
examine the description and examples of that category. For more information, see
{ml-docs}/ml-configuring-categories.html[Categorizing log messages].

[[ml-get-category-path-parms]]
== {api-path-parms-title}

`<job_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]

`<category_id>`::
(Optional, long) Identifier for the category, which is unique in the job. If you
specify neither the category ID nor the `partition_field_value`, the API returns
information about all categories. If you specify only the
`partition_field_value`, it returns information about all categories for the
specified partition.

[[ml-get-category-query-parms]]
== {api-query-parms-title}

`from`::
(Optional, integer) Skips the specified number of categories. Defaults to `0`.

`partition_field_value`::
(Optional, string) Only return categories for the specified partition.

`size`::
(Optional, integer) Specifies the maximum number of categories to obtain.
Defaults to `100`.
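
The query parameters above can be combined into a query string. This sketch uses only the Python standard library. The function name `categories_query` is hypothetical, and the parameter is spelled `from_` only because `from` is a reserved word in Python. Parameters left as `None` are omitted so that the server-side defaults (`0` and `100`) apply.

```python
from typing import Optional
from urllib.parse import urlencode


def categories_query(
    from_: Optional[int] = None,
    size: Optional[int] = None,
    partition_field_value: Optional[str] = None,
) -> str:
    """Encode the documented query parameters (illustrative helper)."""
    params = {}
    # Only include parameters the caller set; unset ones fall back to the defaults.
    if from_ is not None:
        params["from"] = from_
    if size is not None:
        params["size"] = size
    if partition_field_value is not None:
        params["partition_field_value"] = partition_field_value
    return urlencode(params)


# Skip the first 100 categories and return at most 50.
print(categories_query(from_=100, size=50))
```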

[[ml-get-category-request-body]]
== {api-request-body-title}

You can also specify the `partition_field_value` query parameter in the
request body.

`page`::
+
.Properties of `page`
[%collapsible%open]
====
`from`:::
(Optional, integer) Skips the specified number of categories. Defaults to `0`.

`size`:::
(Optional, integer) Specifies the maximum number of categories to obtain.
Defaults to `100`.
====
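
The `page` object lends itself to simple paging. The sketch below builds the request body and walks through all categories one page at a time. It rests on two assumptions that this page does not state outright: `fetch` is a hypothetical callable, supplied by the caller, that sends a body to the API and returns the decoded JSON response; and the `count` field in the response is the total number of matching categories.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def page_body(
    from_: int = 0,
    size: int = 100,
    partition_field_value: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body using the documented `page` properties."""
    body: Dict[str, Any] = {"page": {"from": from_, "size": size}}
    if partition_field_value is not None:
        body["partition_field_value"] = partition_field_value
    return body


def iter_categories(
    fetch: Callable[[Dict[str, Any]], Dict[str, Any]],
    size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every category by requesting successive pages.

    `fetch` is a hypothetical transport function: it takes a request body
    and returns the decoded response as a dict.
    """
    from_ = 0
    while True:
        response = fetch(page_body(from_, size))
        categories = response.get("categories", [])
        yield from categories
        from_ += len(categories)
        # Stop on an empty page, or once the assumed total (`count`) is reached.
        if not categories or from_ >= response.get("count", 0):
            break
```

Passing the transport in as a function keeps the paging logic independent of any particular HTTP client.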

[[ml-get-category-results]]
== {api-response-body-title}

The API returns an array of category objects, which have the following properties:

`category_id`::
(unsigned integer) A unique identifier for the category. `category_id` is unique
at the job level, even when per-partition categorization is enabled.

`examples`::
(array) A list of examples of actual values that matched the category.

`grok_pattern`::
experimental[] (string) A Grok pattern that could be used in {ls} or an ingest
pipeline to extract fields from messages that match the category. This field is
experimental and may be changed or removed in a future release. The Grok
patterns that are found are not optimal, but are often a good starting point for
manual tweaking.

`job_id`::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]

`max_matching_length`::
(unsigned integer) The maximum length of the fields that matched the category.
The value is increased by 10% to enable matching for similar fields that have
not been analyzed.

// This doesn't use the shared description because there are
// categorization-specific aspects to its use in this context
`partition_field_name`::
(string) If per-partition categorization is enabled, this property identifies
the field used to segment the categorization. It is not present when
per-partition categorization is disabled.

`partition_field_value`::
(string) If per-partition categorization is enabled, this property identifies
the value of the `partition_field_name` for the category. It is not present when
per-partition categorization is disabled.

`regex`::
(string) A regular expression that is used to search for values that match the
category.

`terms`::
(string) A space-separated list of the common tokens that are matched in values
of the category.

`num_matches`::
(long) The number of messages that have been matched by this category. The
count is only guaranteed to be up to date after a job `_flush` or `_close`.

`preferred_to_categories`::
(list) A list of `category_id` entries that this current category encompasses.
Any new message that is processed by the categorizer will match against this
category and not any of the categories in this list. The list is only
guaranteed to be up to date after a job `_flush` or `_close`.
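
To make the shape of a category object concrete, here is a minimal Python sketch that maps one decoded JSON object onto a typed record. The `Category` class and `parse_category` function are hypothetical names used for illustration. Properties that the descriptions above say can be absent, such as the partition fields, are modeled as optional, and so are `grok_pattern`, `num_matches`, and `preferred_to_categories` as a defensive assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Category:
    job_id: str
    category_id: int
    terms: List[str]
    regex: str
    max_matching_length: int
    examples: List[str]
    grok_pattern: Optional[str] = None
    # Present only when per-partition categorization is enabled.
    partition_field_name: Optional[str] = None
    partition_field_value: Optional[str] = None
    num_matches: Optional[int] = None
    preferred_to_categories: List[int] = field(default_factory=list)


def parse_category(raw: Dict[str, Any]) -> Category:
    """Convert one category object from the response into a Category."""
    return Category(
        job_id=raw["job_id"],
        category_id=raw["category_id"],
        # `terms` is a single space-separated string; split it into tokens.
        terms=raw["terms"].split(),
        regex=raw["regex"],
        max_matching_length=raw["max_matching_length"],
        examples=list(raw.get("examples", [])),
        grok_pattern=raw.get("grok_pattern"),
        partition_field_name=raw.get("partition_field_name"),
        partition_field_value=raw.get("partition_field_value"),
        num_matches=raw.get("num_matches"),
        preferred_to_categories=list(raw.get("preferred_to_categories", [])),
    )
```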

[[ml-get-category-example]]
== {api-examples-title}

[source,console]
--------------------------------------------------
GET _ml/anomaly_detectors/esxi_log/results/categories
{
  "page":{
    "size": 1
  }
}
--------------------------------------------------
// TEST[skip:todo]

[source,js]
----
{
  "count": 11,
  "categories": [
    {
      "job_id" : "esxi_log",
      "category_id" : 1,
      "terms" : "Vpxa verbose vpxavpxaInvtVm opID VpxaInvtVmChangeListener Guest DiskInfo Changed",
      "regex" : ".*?Vpxa.+?verbose.+?vpxavpxaInvtVm.+?opID.+?VpxaInvtVmChangeListener.+?Guest.+?DiskInfo.+?Changed.*",
      "max_matching_length": 154,
      "examples" : [
        "Oct 19 17:04:44 esxi1.acme.com Vpxa: [3CB3FB90 verbose 'vpxavpxaInvtVm' opID=WFU-33d82c31] [VpxaInvtVmChangeListener] Guest DiskInfo Changed",
        "Oct 19 17:04:45 esxi2.acme.com Vpxa: [3CA66B90 verbose 'vpxavpxaInvtVm' opID=WFU-33927856] [VpxaInvtVmChangeListener] Guest DiskInfo Changed",
        "Oct 19 17:04:51 esxi1.acme.com Vpxa: [FFDBAB90 verbose 'vpxavpxaInvtVm' opID=WFU-25e0d447] [VpxaInvtVmChangeListener] Guest DiskInfo Changed",
        "Oct 19 17:04:58 esxi2.acme.com Vpxa: [FFDDBB90 verbose 'vpxavpxaInvtVm' opID=WFU-bbff0134] [VpxaInvtVmChangeListener] Guest DiskInfo Changed"
      ],
      "grok_pattern" : ".*?%{SYSLOGTIMESTAMP:timestamp}.+?Vpxa.+?%{BASE16NUM:field}.+?verbose.+?vpxavpxaInvtVm.+?opID.+?VpxaInvtVmChangeListener.+?Guest.+?DiskInfo.+?Changed.*"
    }
  ]
}
----
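
One way to use the `regex` and `max_matching_length` values from a response like the one above is to test whether a new message could belong to the category. The Python sketch below is only an illustration: the function name `could_belong` is hypothetical, treating `max_matching_length` as a cheap length pre-filter is an assumption about how the value is meant to be used, and it further assumes the returned expression is compatible with Python's `re` module (which holds for this sample pattern).

```python
import re

# A trimmed copy of the category from the example response (two of the four examples).
category = {
    "regex": ".*?Vpxa.+?verbose.+?vpxavpxaInvtVm.+?opID.+?VpxaInvtVmChangeListener.+?Guest.+?DiskInfo.+?Changed.*",
    "max_matching_length": 154,
    "examples": [
        "Oct 19 17:04:44 esxi1.acme.com Vpxa: [3CB3FB90 verbose 'vpxavpxaInvtVm' opID=WFU-33d82c31] [VpxaInvtVmChangeListener] Guest DiskInfo Changed",
        "Oct 19 17:04:45 esxi2.acme.com Vpxa: [3CA66B90 verbose 'vpxavpxaInvtVm' opID=WFU-33927856] [VpxaInvtVmChangeListener] Guest DiskInfo Changed",
    ],
}


def could_belong(message: str, category: dict) -> bool:
    """Return True if `message` passes the length filter and matches the regex."""
    # Assumed usage: messages longer than max_matching_length are ruled out early.
    if len(message) > category["max_matching_length"]:
        return False
    # The pattern begins and ends with wildcards, so require a match of the whole message.
    return re.fullmatch(category["regex"], message) is not None


for example in category["examples"]:
    print(could_belong(example, category))
```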