[role="xpack"]
[testenv="platinum"]
[[evaluate-dfanalytics]]
= Evaluate {dfanalytics} API

[subs="attributes"]
++++
<titleabbrev>Evaluate {dfanalytics}</titleabbrev>
++++

Evaluates the {dfanalytics} for an annotated index.

beta::[]

[[ml-evaluate-dfanalytics-request]]
== {api-request-title}

`POST _ml/data_frame/_evaluate`

[[ml-evaluate-dfanalytics-prereq]]
== {api-prereq-title}

If the {es} {security-features} are enabled, you must have the following
privileges:

* cluster: `monitor_ml`

For more information, see <<security-privileges>> and
{ml-docs-setup-privileges}.

[[ml-evaluate-dfanalytics-desc]]
== {api-description-title}

The API packages together commonly used evaluation metrics for various types
of machine learning features. It is designed for use on indexes created by
{dfanalytics}. Evaluation requires both a ground truth field and an analytics
result field to be present.

[[ml-evaluate-dfanalytics-request-body]]
== {api-request-body-title}

`evaluation`::
(Required, object) Defines the type of evaluation you want to perform.
See <<ml-evaluate-dfanalytics-resources>>.
+
--
Available evaluation types:

* `outlier_detection`
* `regression`
* `classification`
--

`index`::
(Required, object) Defines the `index` in which the evaluation will be
performed.

`query`::
(Optional, object) A query clause that retrieves a subset of data from the
source index. See <<query-dsl>>.
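Put together, a minimal evaluation request has the following shape. This is a
sketch only; `my_analytics_dest_index`, `is_outlier`, and `ml.outlier_score`
are placeholder names, and the `query` clause can be omitted entirely:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "query": { "match_all": {} },
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]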
[[ml-evaluate-dfanalytics-resources]]
== {dfanalytics-cap} evaluation resources

[[oldetection-resources]]
=== {oldetection-cap} evaluation objects

{oldetection-cap} evaluates the results of an {oldetection} analysis which
outputs the probability that each document is an outlier.

`actual_field`::
  (Required, string) The field of the `index` which contains the `ground
  truth`. The data type of this field can be boolean or integer. If the data
  type is integer, the value has to be either `0` (false) or `1` (true).

`predicted_probability_field`::
  (Required, string) The field of the `index` that defines the probability of
  whether the item belongs to the class in question or not. It's the field
  that contains the results of the analysis.

`metrics`::
  (Optional, object) Specifies the metrics that are used for the evaluation.
  If no metrics are specified, the following are returned by default (see the
  sketch after this list for how to override them):

  * `auc_roc` (`include_curve`: false),
  * `precision` (`at`: [0.25, 0.5, 0.75]),
  * `recall` (`at`: [0.25, 0.5, 0.75]),
  * `confusion_matrix` (`at`: [0.25, 0.5, 0.75]).

  `auc_roc`:::
    (Optional, object) The AUC ROC (area under the curve of the receiver
    operating characteristic) score and optionally the curve. Default value
    is {"include_curve": false}.

  `confusion_matrix`:::
    (Optional, object) Sets the different thresholds of the {olscore} at
    which the metrics (`tp` - true positive, `fp` - false positive, `tn` -
    true negative, `fn` - false negative) are calculated. Default value is
    {"at": [0.25, 0.50, 0.75]}.

  `precision`:::
    (Optional, object) Sets the different thresholds of the {olscore} at
    which the metric is calculated. Default value is
    {"at": [0.25, 0.50, 0.75]}.

  `recall`:::
    (Optional, object) Sets the different thresholds of the {olscore} at
    which the metric is calculated. Default value is
    {"at": [0.25, 0.50, 0.75]}.
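For instance, to also return the ROC curve and compute the confusion matrix
at custom thresholds, override the defaults inside `metrics`. The following
is a sketch that reuses the placeholder names from above:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score",
      "metrics": {
        "auc_roc": { "include_curve": true },
        "confusion_matrix": { "at": [0.1, 0.5, 0.9] }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]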
[[regression-evaluation-resources]]
=== {regression-cap} evaluation objects

{regression-cap} evaluation evaluates the results of a {regression} analysis
which outputs a prediction of values.

`actual_field`::
  (Required, string) The field of the `index` which contains the `ground
  truth`. The data type of this field must be numerical.

`predicted_field`::
  (Required, string) The field in the `index` that contains the predicted
  value, in other words the results of the {regression} analysis.

`metrics`::
  (Optional, object) Specifies the metrics that are used for the evaluation.
  For more information on `mse`, `msle`, and `huber`, consult
  https://github.com/elastic/examples/tree/master/Machine%20Learning/Regression%20Loss%20Functions[the Jupyter notebook on regression loss functions].
  If no metrics are specified, the following are returned by default:

  * `mse`,
  * `r_squared`,
  * `huber` (`delta`: 1.0).

  `mse`:::
    (Optional, object) Average squared difference between the predicted
    values and the actual (`ground truth`) value. For more information, read
    {wikipedia}/Mean_squared_error[this wiki article].

  `msle`:::
    (Optional, object) Average squared difference between the logarithm of
    the predicted values and the logarithm of the actual (`ground truth`)
    value.

    `offset`::::
      (Optional, double) Defines the transition point at which you switch
      from minimizing quadratic error to minimizing quadratic log error.
      Defaults to `1`.

  `huber`:::
    (Optional, object) Pseudo Huber loss function (see the formula after this
    list). For more information, read
    {wikipedia}/Huber_loss#Pseudo-Huber_loss_function[this wiki article].

    `delta`::::
      (Optional, double) Approximates 1/2 (prediction - actual)^2^ for values
      much less than delta and approximates a straight line with slope delta
      for values much larger than delta. Defaults to `1`. Delta needs to be
      greater than `0`.

  `r_squared`:::
    (Optional, object) Proportion of the variance in the dependent variable
    that is predictable from the independent variables. For more information,
    read {wikipedia}/Coefficient_of_determination[this wiki article].
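For reference, the pseudo Huber loss has the closed form given in the linked
Wikipedia article, where latexmath:[a] stands for the difference between the
predicted and the actual value:

latexmath:[L_\delta(a) = \delta^2 \left( \sqrt{1 + (a/\delta)^2} - 1 \right)]

For latexmath:[|a| \ll \delta] this behaves like latexmath:[a^2/2], and for
latexmath:[|a| \gg \delta] it approaches a straight line with slope
latexmath:[\delta], which is the behavior the `delta` parameter controls.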
[[classification-evaluation-resources]]
=== {classification-cap} evaluation objects

{classification-cap} evaluation evaluates the results of a {classanalysis}
which outputs a prediction that identifies to which of the classes each
document belongs.

`actual_field`::
  (Required, string) The field of the `index` which contains the `ground
  truth`. The data type of this field must be categorical.

`predicted_field`::
  (Optional, string) The field in the `index` which contains the predicted
  value, in other words the results of the {classanalysis}.

`top_classes_field`::
  (Optional, string) The field of the `index` which is an array of documents
  of the form `{ "class_name": XXX, "class_probability": YYY }`. This field
  must be defined as `nested` in the mappings. See the sketch after this list
  for an illustration.

`metrics`::
  (Optional, object) Specifies the metrics that are used for the evaluation.
  If no metrics are specified, the following are returned by default:

  * `accuracy`,
  * `multiclass_confusion_matrix`,
  * `precision`,
  * `recall`.

  `accuracy`:::
    (Optional, object) Accuracy of predictions (per-class and overall).

  `auc_roc`:::
    (Optional, object) The AUC ROC (area under the curve of the receiver
    operating characteristic) score and optionally the curve. It is
    calculated for a specific class (provided as "class_name") treated as
    positive.

    `class_name`::::
      (Required, string) Name of the only class that is treated as positive
      during AUC ROC calculation. Other classes are treated as negative
      ("one-vs-all" strategy). All the evaluated documents must have
      `class_name` in the list of their top classes.

    `include_curve`::::
      (Optional, Boolean) Whether or not the curve should be returned in
      addition to the score. Default value is false.

  `multiclass_confusion_matrix`:::
    (Optional, object) Multiclass confusion matrix.

  `precision`:::
    (Optional, object) Precision of predictions (per-class and average).

  `recall`:::
    (Optional, object) Recall of predictions (per-class and average).
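To illustrate the `top_classes_field` structure, a document in the
destination index might carry results like the following. This is a sketch:
`animal_class`, `ml.animal_class_prediction`, and `ml.top_classes` mirror the
names used in the classification example later on this page, and the
probabilities are made up:

[source,js]
--------------------------------------------------
{
  "animal_class": "dog",
  "ml": {
    "animal_class_prediction": "dog",
    "top_classes": [
      { "class_name": "dog", "class_probability": 0.85 },
      { "class_name": "cat", "class_probability": 0.15 }
    ]
  }
}
--------------------------------------------------
// NOTCONSOLE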
////
[[ml-evaluate-dfanalytics-results]]
== {api-response-body-title}

`outlier_detection`::
  (object) If you chose to do outlier detection, the API returns the
  following evaluation metrics:

  `auc_roc`::: TBD

  `confusion_matrix`::: TBD

  `precision`::: TBD

  `recall`::: TBD
////

[[ml-evaluate-dfanalytics-example]]
== {api-examples-title}

[[ml-evaluate-oldetection-example]]
=== {oldetection-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "outlier_detection": {
    "auc_roc": {
      "value": 0.92584757746414444
    },
    "confusion_matrix": {
      "0.25": {
          "tp": 5,
          "fp": 9,
          "tn": 204,
          "fn": 5
      },
      "0.5": {
          "tp": 1,
          "fp": 5,
          "tn": 208,
          "fn": 9
      },
      "0.75": {
          "tp": 0,
          "fp": 4,
          "tn": 209,
          "fn": 10
      }
    },
    "precision": {
        "0.25": 0.35714285714285715,
        "0.5": 0.16666666666666666,
        "0.75": 0
    },
    "recall": {
        "0.25": 0.5,
        "0.5": 0.1,
        "0.75": 0
    }
  }
}
----

[[ml-evaluate-regression-example]]
=== {regression-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "house_price_predictions", <1>
  "query": {
    "bool": {
      "filter": [
        { "term": { "ml.is_training": false } } <2>
      ]
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price", <3>
      "predicted_field": "ml.price_prediction", <4>
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {"offset": 10},
        "huber": {"delta": 1.5}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> The output destination index from a {dfanalytics} {reganalysis}.
<2> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to be performed on the test
split only.
<3> The ground truth value for the actual house price. This is required in
order to evaluate results.
<4> The predicted value for house price calculated by the {reganalysis}.

The following example calculates the training error:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {},
        "huber": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to be performed on the train
split only. It means that a training error will be calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.
The next example calculates the testing error. The only difference compared
with the previous example is that `ml.is_training` is set to `false` this
time, so the query excludes the train split from the evaluation.

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": false <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {},
        "huber": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to be performed on the test
split only. It means that a testing error will be calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.

[[ml-evaluate-classification-example]]
=== {classification-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
   "index": "animal_classification",
   "evaluation": {
      "classification": { <1>
         "actual_field": "animal_class", <2>
         "predicted_field": "ml.animal_class_prediction", <3>
         "metrics": {
           "multiclass_confusion_matrix" : {} <4>
         }
      }
   }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> The evaluation type.
<2> The field that contains the ground truth value for the actual animal
classification. This is required in order to evaluate results.
<3> The field that contains the predicted value for animal classification by
the {classanalysis}.
<4> Specifies the metric for the evaluation.

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
   "classification" : {
      "multiclass_confusion_matrix" : {
         "confusion_matrix" : [
         {
            "actual_class" : "cat", <1>
            "actual_class_doc_count" : 12, <2>
            "predicted_classes" : [ <3>
              {
                "predicted_class" : "cat",
                "count" : 12 <4>
              },
              {
                "predicted_class" : "dog",
                "count" : 0 <5>
              }
            ],
            "other_predicted_class_doc_count" : 0 <6>
          },
          {
            "actual_class" : "dog",
            "actual_class_doc_count" : 11,
            "predicted_classes" : [
              {
                "predicted_class" : "dog",
                "count" : 7
              },
              {
                "predicted_class" : "cat",
                "count" : 4
              }
            ],
            "other_predicted_class_doc_count" : 0
          }
        ],
        "other_actual_class_count" : 0
      }
    }
  }
--------------------------------------------------

<1> The name of the actual class that the analysis tried to predict.
<2> The number of documents in the index that belong to the `actual_class`.
<3> This object contains the list of the predicted classes and the number of
predictions associated with the class.
<4> The number of cats in the dataset that are correctly identified as cats.
<5> The number of cats in the dataset that are incorrectly classified as
dogs.
<6> The number of documents that are classified as a class that is not
listed as a `predicted_class`.

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
   "index": "animal_classification",
   "evaluation": {
      "classification": { <1>
         "actual_field": "animal_class", <2>
         "metrics": {
            "auc_roc" : { <3>
              "class_name": "dog" <4>
            }
         }
      }
   }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> The evaluation type.
<2> The field that contains the ground truth value for the actual animal
classification. This is required in order to evaluate results.
<3> Specifies the metric for the evaluation.
<4> Specifies the class name that is treated as positive during the
evaluation, all the other classes are treated as negative.

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
  "classification" : {
    "auc_roc" : {
      "value" : 0.8941788639536681
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
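To also retrieve the ROC curve points for the same class, add the
`include_curve` flag described earlier. The following is a sketch based on
the previous request:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
   "index": "animal_classification",
   "evaluation": {
      "classification": {
         "actual_field": "animal_class",
         "metrics": {
            "auc_roc" : {
              "class_name": "dog",
              "include_curve": true
            }
         }
      }
   }
}
--------------------------------------------------
// TEST[skip:TBD]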