dfanalyticsresources.asciidoc 4.4 KB

  1. [role="xpack"]
  2. [testenv="platinum"]
  3. [[ml-dfanalytics-resources]]
  4. === {dfanalytics-cap} job resources
  5. {dfanalytics-cap} resources relate to APIs such as <<put-dfanalytics>> and
  6. <<get-dfanalytics>>.
  7. [discrete]
  8. [[ml-dfanalytics-properties]]
  9. ==== {api-definitions-title}
  10. `analysis`::
  11. (object) The type of analysis that is performed on the `source`. For example:
  12. `outlier_detection`. For more information, see <<dfanalytics-types>>.
  13. `analyzed_fields`::
  14. (object) You can specify `includes` patterns, `excludes` patterns, or both. If
  15. `analyzed_fields` is not set, only the relevant fields are included; for
  16. example, all the numeric fields for {oldetection}.
  17. `analyzed_fields.includes`:::
  18. (array) An array of strings that defines the fields that will be included in
  19. the analysis.
  20. `analyzed_fields.excludes`:::
  21. (array) An array of strings that defines the fields that will be excluded
  22. from the analysis.
  23. [source,js]
  24. --------------------------------------------------
  25. PUT _ml/data_frame/analytics/loganalytics
  26. {
  27.   "source": {
  28.     "index": "logdata"
  29.   },
  30.   "dest": {
  31.     "index": "logdata_out"
  32.   },
  33.   "analysis": {
  34.     "outlier_detection": {
  35.     }
  36.   },
  37.   "analyzed_fields": {
  38.     "includes": [ "request.bytes", "response.counts.error" ],
  39.     "excludes": [ "source.geo" ]
  40.   }
  41. }
  42. --------------------------------------------------
  43. // CONSOLE
  44. // TEST[setup:setup_logdata]
  45. `description`::
  46. (Optional, string) A description of the job.
  47. `dest`::
  48. (object) The destination configuration of the analysis.
  49. `index`:::
  50. (Required, string) Defines the _destination index_ to store the results of
  51. the {dfanalytics-job}.
  52. `results_field`:::
  53. (Optional, string) Defines the name of the field in which to store the
  54. results of the analysis. Defaults to `ml`.
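As an illustration, the following sketch stores the results in a custom field
rather than the default `ml`. The job ID `loganalytics_custom` and the field
name `outliers` are hypothetical; the index names are borrowed from the example
above. The snippet is untested.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics_custom
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out",
    "results_field": "outliers"
  },
  "analysis": {
    "outlier_detection": {
    }
  }
}
--------------------------------------------------
// NOTCONSOLE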
  55. `id`::
  56. (string) The unique identifier for the {dfanalytics-job}. This identifier can
  57. contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and
  58. underscores. It must start and end with alphanumeric characters. This property
  59. is informational; you cannot change the identifier for existing jobs.
  60. `model_memory_limit`::
  61. (string) The approximate maximum amount of memory resources that are
  62. permitted for analytical processing. The default value for {dfanalytics-jobs}
  63. is `1gb`. If your `elasticsearch.yml` file contains an
  64. `xpack.ml.max_model_memory_limit` setting, an error occurs when you try to
  65. create {dfanalytics-jobs} that have `model_memory_limit` values greater than
  66. that setting. For more information, see <<ml-settings>>.
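A minimal, untested sketch of a job that lowers the memory limit from the
default might look like the following. The job ID `loganalytics_small` and the
`500mb` value are illustrative assumptions. If the cluster defines
`xpack.ml.max_model_memory_limit`, the requested value must not exceed it.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics_small
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
    }
  },
  "model_memory_limit": "500mb"
}
--------------------------------------------------
// NOTCONSOLE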
  67. `source`::
  68. (object) The source configuration, consisting of an `index` and, optionally, a
  69. `query` object.
  70. `index`:::
  71. (Required, string or array) Index or indices on which to perform the
  72. analysis. It can be a single index or index pattern as well as an array of
  73. indices or patterns.
  74. `query`:::
  75. (Optional, object) The {es} query domain-specific language
  76. (<<query-dsl,DSL>>). This value corresponds to the query object in an {es}
  77. search POST body. All the options that are supported by {es} can be used,
  78. as this object is passed verbatim to {es}. By default, this property has
  79. the following value: `{"match_all": {}}`.
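To show how `index` and `query` combine, the following untested sketch reads
from an array of indices and restricts the analysis to documents that match a
range query. The job ID `loganalytics_filtered`, the `logdata-*` pattern, and
the `1000` byte threshold are hypothetical; `request.bytes` is the field used
in the earlier example.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics_filtered
{
  "source": {
    "index": [ "logdata", "logdata-*" ],
    "query": {
      "range": {
        "request.bytes": { "gte": 1000 }
      }
    }
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
    }
  }
}
--------------------------------------------------
// NOTCONSOLE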
  80. [[dfanalytics-types]]
  81. ==== Analysis objects
  82. {dfanalytics-cap} resources contain `analysis` objects. For example, when you
  83. create a {dfanalytics-job}, you must define the type of analysis it performs.
  84. Currently, `outlier_detection` is the only available type of analysis; other
  85. types, such as `regression`, will be added.
  86. [discrete]
  87. [[oldetection-resources]]
  88. ==== {oldetection-cap} configuration objects
  89. An {oldetection} configuration object has the following properties:
  90. `n_neighbors`::
  91. (integer) The number of nearest neighbors that each method of
  92. {oldetection} uses to calculate its {olscore}. When the value is
  93. not set, the system dynamically detects an appropriate value.
  94. `method`::
  95. (string) Sets the method that {oldetection} uses. If the method is not set,
  96. {oldetection} uses an ensemble of different methods, then normalizes and
  97. combines their individual {olscores} to obtain the overall {olscore}. We
  98. recommend using the ensemble method. Available methods are `lof`, `ldof`,
  99. `distance_kth_nn`, and `distance_knn`.
  100. `feature_influence_threshold`::
  101. (double) The minimum {olscore} that a document must have for its
  102. {fiscore} to be calculated.
  103. Value range: 0-1 (`0.1` by default).
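The three properties can be set together inside the `outlier_detection`
object. The following untested sketch uses illustrative values only: the job
ID `loganalytics_lof`, `n_neighbors` of `20`, and a threshold of `0.3` are
assumptions. It pins the method to `lof` purely to show the syntax; omitting
`method` selects the recommended ensemble.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics_lof
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out"
  },
  "analysis": {
    "outlier_detection": {
      "n_neighbors": 20,
      "method": "lof",
      "feature_influence_threshold": 0.3
    }
  }
}
--------------------------------------------------
// NOTCONSOLE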