[role="xpack"]
[testenv="platinum"]
[[evaluate-dfanalytics]]
=== Evaluate {dfanalytics} API

[subs="attributes"]
++++
<titleabbrev>Evaluate {dfanalytics}</titleabbrev>
++++

Evaluates the {dfanalytics} for an annotated index.

experimental[]

[[ml-evaluate-dfanalytics-request]]
==== {api-request-title}

`POST _ml/data_frame/_evaluate`

[[ml-evaluate-dfanalytics-prereq]]
==== {api-prereq-title}

* You must have the `monitor_ml` privilege to use this API. For more
information, see <<security-privileges>> and <<built-in-roles>>.

[[ml-evaluate-dfanalytics-desc]]
==== {api-description-title}

The API packages together commonly used evaluation metrics for various types of
machine learning features. It is designed for use on indices created by
{dfanalytics}. Evaluation requires both a ground truth field and an analytics
result field to be present.
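
For example, a document in the destination index of a {reganalysis} carries the
ground truth and the analytics result side by side. The field names below match
the {regression} example later on this page; the values are purely
illustrative:

[source,js]
--------------------------------------------------
{
  "price": 267000,                 <1>
  "ml": {
    "price_prediction": 253185.1,  <2>
    "is_training": false
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> The ground truth field.
<2> The analytics result field added by the {reganalysis}.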

[[ml-evaluate-dfanalytics-request-body]]
==== {api-request-body-title}

`index`::
(Required, object) Defines the `index` in which the evaluation will be
performed.

`query`::
(Optional, object) A query clause that retrieves a subset of data from the
source index. See <<query-dsl>>.

`evaluation`::
(Required, object) Defines the type of evaluation you want to perform. See
<<ml-evaluate-dfanalytics-resources>>.
+
--
Available evaluation types (a minimal request skeleton follows this list):

* `binary_soft_classification`
* `regression`
* `classification`
--
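
Putting the three body parameters together, every request has the same overall
shape. The following is a non-runnable sketch: the index name, the field names,
and the `match_all` query are placeholders, and you supply exactly one
evaluation type per request:

[source,js]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_dest_index",
  "query": { "match_all": {} },
  "evaluation": {
    "regression": {
      "actual_field": "my_ground_truth_field",
      "predicted_field": "ml.my_prediction_field"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE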

////
[[ml-evaluate-dfanalytics-results]]
==== {api-response-body-title}

`binary_soft_classification`::
(object) If you chose to do binary soft classification, the API returns the
following evaluation metrics:

`auc_roc`::: TBD

`confusion_matrix`::: TBD

`precision`::: TBD

`recall`::: TBD
////

[[ml-evaluate-dfanalytics-example]]
==== {api-examples-title}

[[ml-evaluate-binary-soft-class-example]]
===== Binary soft classification

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "binary_soft_classification": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "binary_soft_classification": {
    "auc_roc": {
      "score": 0.92584757746414444
    },
    "confusion_matrix": {
      "0.25": {
        "tp": 5,
        "fp": 9,
        "tn": 204,
        "fn": 5
      },
      "0.5": {
        "tp": 1,
        "fp": 5,
        "tn": 208,
        "fn": 9
      },
      "0.75": {
        "tp": 0,
        "fp": 4,
        "tn": 209,
        "fn": 10
      }
    },
    "precision": {
      "0.25": 0.35714285714285715,
      "0.5": 0.16666666666666666,
      "0.75": 0
    },
    "recall": {
      "0.25": 0.5,
      "0.5": 0.1,
      "0.75": 0
    }
  }
}
----
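
As a sanity check, the `precision` and `recall` values follow directly from the
confusion matrix at each probability threshold, using the standard definitions.
Worked out by hand for the `0.25` threshold in the response above:

----
precision(0.25) = tp / (tp + fp) = 5 / (5 + 9) ≈ 0.357
recall(0.25)    = tp / (tp + fn) = 5 / (5 + 5) = 0.5
----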

[[ml-evaluate-regression-example]]
===== {regression-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "house_price_predictions", <1>
  "query": {
    "bool": {
      "filter": [
        { "term": { "ml.is_training": false } } <2>
      ]
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price", <3>
      "predicted_field": "ml.price_prediction", <4>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The output destination index from a {dfanalytics} {reganalysis}.
<2> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to the test split only.
<3> The ground truth value for the actual house price. This is required in
order to evaluate results.
<4> The predicted house price calculated by the {reganalysis}.
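
Both requested metrics follow their standard definitions, computed over the
documents that match the query:

----
mean_squared_error = (1/n) * Σ (actual_i - predicted_i)^2
r_squared          = 1 - Σ (actual_i - predicted_i)^2 / Σ (actual_i - mean(actual))^2
----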

The following example calculates the training error:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to the train split only, which
means that a training error is calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted student performance calculated by
the {reganalysis}.

The next example calculates the testing error. The only difference from the
previous example is that `ml.is_training` is set to `false` this time, so the
query excludes the train split from the evaluation. Comparing the resulting
testing error with the training error above is a quick check for overfitting: a
testing error that is much larger than the training error suggests that the
model does not generalize beyond the training split.

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": false <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to the test split only, which
means that a testing error is calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted student performance calculated by
the {reganalysis}.

[[ml-evaluate-classification-example]]
===== {classification-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "animal_classification",
  "evaluation": {
    "classification": { <1>
      "actual_field": "animal_class", <2>
      "predicted_field": "ml.animal_class_prediction.keyword", <3>
      "metrics": {
        "multiclass_confusion_matrix": {} <4>
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The evaluation type.
<2> The field that contains the ground truth value for the actual animal
classification. This is required in order to evaluate results.
<3> The field that contains the predicted animal classification calculated by
the {classanalysis}. Because the field that stores the predicted class is
dynamically mapped as both `text` and `keyword`, you must add the `.keyword`
suffix to the field name.
<4> Specifies the metric for the evaluation.
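
The `.keyword` suffix works because, under default dynamic mapping, a string
field written by the analysis typically receives a multi-field mapping similar
to the following sketch (illustrative only; the actual mapping depends on your
index templates):

[source,js]
--------------------------------------------------
"ml.animal_class_prediction": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
--------------------------------------------------
// NOTCONSOLE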

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
  "classification" : {
    "multiclass_confusion_matrix" : {
      "confusion_matrix" : [
        {
          "actual_class" : "cat", <1>
          "actual_class_doc_count" : 12, <2>
          "predicted_classes" : [ <3>
            {
              "predicted_class" : "cat",
              "count" : 12 <4>
            },
            {
              "predicted_class" : "dog",
              "count" : 0 <5>
            }
          ],
          "other_predicted_class_doc_count" : 0 <6>
        },
        {
          "actual_class" : "dog",
          "actual_class_doc_count" : 11,
          "predicted_classes" : [
            {
              "predicted_class" : "dog",
              "count" : 7
            },
            {
              "predicted_class" : "cat",
              "count" : 4
            }
          ],
          "other_predicted_class_doc_count" : 0
        }
      ],
      "other_actual_class_count" : 0
    }
  }
}
--------------------------------------------------
<1> The name of the actual class that the analysis tried to predict.
<2> The number of documents in the index that belong to the `actual_class`.
<3> This object contains the list of the predicted classes and the number of
predictions associated with the class.
<4> The number of cats in the dataset that are correctly identified as cats.
<5> The number of cats in the dataset that are incorrectly classified as dogs.
<6> The number of documents that are classified as a class that is not listed
as a `predicted_class`.
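
Reading the matrix row by row gives the per-class recall, and summing the
diagonal counts gives the overall accuracy. Worked out by hand for the response
above:

----
recall(cat) = 12 / 12 = 1.0
recall(dog) =  7 / 11 ≈ 0.636
accuracy    = (12 + 7) / (12 + 11) = 19 / 23 ≈ 0.826
----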