[role="xpack"]
[testenv="platinum"]
[[evaluate-dfanalytics]]
=== Evaluate {dfanalytics} API

[subs="attributes"]
++++
<titleabbrev>Evaluate {dfanalytics}</titleabbrev>
++++

Evaluates the {dfanalytics} for an annotated index.

experimental[]

[[ml-evaluate-dfanalytics-request]]
==== {api-request-title}

`POST _ml/data_frame/_evaluate`

[[ml-evaluate-dfanalytics-prereq]]
==== {api-prereq-title}

* You must have the `monitor_ml` privilege to use this API. For more
information, see {stack-ov}/security-privileges.html[Security privileges] and
{stack-ov}/built-in-roles.html[Built-in roles].

[[ml-evaluate-dfanalytics-desc]]
==== {api-description-title}

The API packages together commonly used evaluation metrics for various types of
machine learning features. It is designed for use on indexes created by
{dfanalytics}. Evaluation requires both a ground truth field and an analytics
result field to be present in the index.

[[ml-evaluate-dfanalytics-request-body]]
==== {api-request-body-title}

`index`::
(Required, object) Defines the `index` in which the evaluation will be
performed.

`query`::
(Optional, object) A query clause that retrieves a subset of data from the
source index. See <<query-dsl>>.

`evaluation`::
(Required, object) Defines the type of evaluation you want to perform. See
<<ml-evaluate-dfanalytics-resources>>.
+
--
Available evaluation types:

* `binary_soft_classification`
* `regression`
--
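
The three request body properties above can be assembled programmatically. The following is a minimal Python sketch, not an official client: `build_evaluate_body` is a hypothetical helper introduced only for illustration, and it assumes that the body is plain JSON with `index` passed as a string, as in the examples on this page.

[source,python]
----
import json

# The evaluation types listed in this section.
EVALUATION_TYPES = {"binary_soft_classification", "regression"}


def build_evaluate_body(index, evaluation_type, evaluation_config, query=None):
    """Assemble a body for POST _ml/data_frame/_evaluate.

    Hypothetical helper for illustration; it is not part of any Elastic client.
    """
    if evaluation_type not in EVALUATION_TYPES:
        raise ValueError("unsupported evaluation type: " + evaluation_type)
    body = {
        "index": index,
        "evaluation": {evaluation_type: evaluation_config},
    }
    # `query` is optional, so it is only included when provided.
    if query is not None:
        body["query"] = query
    return body


body = build_evaluate_body(
    "my_analytics_dest_index",
    "binary_soft_classification",
    {
        "actual_field": "is_outlier",
        "predicted_probability_field": "ml.outlier_score",
    },
)
print(json.dumps(body, indent=2))
----

The printed JSON can be sent as the body of the `POST` request with any HTTP client.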

////
[[ml-evaluate-dfanalytics-results]]
==== {api-response-body-title}

`binary_soft_classification`::
(object) If you chose to do binary soft classification, the API returns the
following evaluation metrics:

`auc_roc`::: TBD

`confusion_matrix`::: TBD

`precision`::: TBD

`recall`::: TBD
////

[[ml-evaluate-dfanalytics-example]]
==== {api-examples-title}

===== Binary soft classification

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "binary_soft_classification": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "binary_soft_classification": {
    "auc_roc": {
      "score": 0.92584757746414444
    },
    "confusion_matrix": {
      "0.25": {
          "tp": 5,
          "fp": 9,
          "tn": 204,
          "fn": 5
      },
      "0.5": {
          "tp": 1,
          "fp": 5,
          "tn": 208,
          "fn": 9
      },
      "0.75": {
          "tp": 0,
          "fp": 4,
          "tn": 209,
          "fn": 10
      }
    },
    "precision": {
        "0.25": 0.35714285714285715,
        "0.5": 0.16666666666666666,
        "0.75": 0
    },
    "recall": {
        "0.25": 0.5,
        "0.5": 0.1,
        "0.75": 0
    }
  }
}
----
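
The `precision` and `recall` values in this response follow from the `confusion_matrix` counts at each threshold. The Python sketch below recomputes them using the standard definitions, precision = tp / (tp + fp) and recall = tp / (tp + fn). Returning `0.0` when a denominator is zero is an assumption made for this sketch, not documented API behavior.

[source,python]
----
# Confusion matrix counts copied from the example response, keyed by threshold.
confusion_matrix = {
    "0.25": {"tp": 5, "fp": 9, "tn": 204, "fn": 5},
    "0.5": {"tp": 1, "fp": 5, "tn": 208, "fn": 9},
    "0.75": {"tp": 0, "fp": 4, "tn": 209, "fn": 10},
}


def precision(counts):
    """Fraction of predicted positives that are actual positives."""
    predicted_positive = counts["tp"] + counts["fp"]
    # Assumption for this sketch: report 0.0 when nothing was predicted positive.
    return counts["tp"] / predicted_positive if predicted_positive else 0.0


def recall(counts):
    """Fraction of actual positives that were predicted positive."""
    actual_positive = counts["tp"] + counts["fn"]
    # Assumption for this sketch: report 0.0 when there are no actual positives.
    return counts["tp"] / actual_positive if actual_positive else 0.0


for threshold, counts in confusion_matrix.items():
    print(threshold, precision(counts), recall(counts))
----

At threshold `0.25`, for example, precision is 5 / (5 + 9) and recall is 5 / (5 + 5), which match the returned values. The four counts sum to 223 documents at every threshold, and raising the threshold moves documents from the predicted-positive counts (`tp`, `fp`) to the predicted-negative counts (`tn`, `fn`).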

===== {regression-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "house_price_predictions", <1>
  "query": {
      "bool": {
        "filter": [
          { "term":  { "ml.is_training": false } } <2>
        ]
      }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price", <3>
      "predicted_field": "ml.price_prediction", <4>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> The output destination index from a {dfanalytics} {reganalysis}.
<2> In this example, a test/train split (`training_percent`) was defined for the
{reganalysis}. This query limits the evaluation to the test split only.
<3> The ground truth value for the actual house price. This is required to
evaluate the results.
<4> The predicted value for the house price calculated by the {reganalysis}.
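
To make the two requested metrics concrete, the Python sketch below computes them from their textbook definitions: mean squared error is the average of the squared differences between actual and predicted values, and R-squared is 1 minus the ratio of the residual sum of squares to the total sum of squares. This page does not state how the API computes these metrics internally, so treat the formulas as an assumption. The price lists are made-up illustrative values, not output from the API.

[source,python]
----
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical values standing in for `price` and `ml.price_prediction`.
actual_prices = [200.0, 300.0, 400.0, 500.0]
predicted_prices = [210.0, 290.0, 420.0, 480.0]

print("mean_squared_error:", mean_squared_error(actual_prices, predicted_prices))
print("r_squared:", r_squared(actual_prices, predicted_prices))
----

A lower mean squared error and an R-squared closer to 1 both indicate predictions that track the ground truth more closely. Restricting the evaluation to documents where `ml.is_training` is `false`, as the query above does, keeps the metrics from being inflated by data the model was trained on.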