[role="xpack"]
[[evaluate-dfanalytics]]
= Evaluate {dfanalytics} API

[subs="attributes"]
++++
<titleabbrev>Evaluate {dfanalytics}</titleabbrev>
++++

Evaluates the {dfanalytics} for an annotated index.

[[ml-evaluate-dfanalytics-request]]
== {api-request-title}

`POST _ml/data_frame/_evaluate`

[[ml-evaluate-dfanalytics-prereq]]
== {api-prereq-title}

Requires the following privileges:

* cluster: `monitor_ml` (the `machine_learning_user` built-in role grants this
privilege)
* destination index: `read`

[[ml-evaluate-dfanalytics-desc]]
== {api-description-title}

The API packages together commonly used evaluation metrics for various types of
machine learning features. This has been designed for use on indexes created by
{dfanalytics}. Evaluation requires both a ground truth field and an analytics
result field to be present.
[[ml-evaluate-dfanalytics-request-body]]
== {api-request-body-title}

`evaluation`::
(Required, object) Defines the type of evaluation you want to perform.
See <<ml-evaluate-dfanalytics-resources>>.
+
--
Available evaluation types:

* `outlier_detection`
* `regression`
* `classification`
--

`index`::
(Required, object) Defines the `index` in which the evaluation will be
performed.

`query`::
(Optional, object) A query clause that retrieves a subset of data from the
source index. See <<query-dsl>>.
[[ml-evaluate-dfanalytics-resources]]
== {dfanalytics-cap} evaluation resources

[[oldetection-resources]]
=== {oldetection-cap} evaluation objects

{oldetection-cap} evaluates the results of an {oldetection} analysis which
outputs the probability that each document is an outlier.

`actual_field`::
(Required, string) The field of the `index` which contains the `ground truth`.
The data type of this field can be boolean or integer. If the data type is
integer, the value has to be either `0` (false) or `1` (true).

`predicted_probability_field`::
(Required, string) The field of the `index` that defines the probability of
whether the item belongs to the class in question or not. It's the field that
contains the results of the analysis.
`metrics`::
(Optional, object) Specifies the metrics that are used for the evaluation. If
no metrics are specified, the following are returned by default (see the
example request after this list):

* `auc_roc` (`include_curve`: false),
* `precision` (`at`: [0.25, 0.5, 0.75]),
* `recall` (`at`: [0.25, 0.5, 0.75]),
* `confusion_matrix` (`at`: [0.25, 0.5, 0.75]).

`auc_roc`:::
(Optional, object) The AUC ROC (area under the curve of the receiver
operating characteristic) score and optionally the curve. Default value is
`{"include_curve": false}`.

`confusion_matrix`:::
(Optional, object) Sets the different thresholds of the {olscore} at which
the metrics (`tp` - true positive, `fp` - false positive, `tn` - true
negative, `fn` - false negative) are calculated. Default value is
`{"at": [0.25, 0.50, 0.75]}`.

`precision`:::
(Optional, object) Sets the different thresholds of the {olscore} at which
the metric is calculated. Default value is `{"at": [0.25, 0.50, 0.75]}`.

`recall`:::
(Optional, object) Sets the different thresholds of the {olscore} at which
the metric is calculated. Default value is `{"at": [0.25, 0.50, 0.75]}`.

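For example, the following request overrides the default thresholds and also
asks for the ROC curve points. This is a sketch that reuses the
`my_analytics_dest_index` destination index and result fields from the
<<ml-evaluate-oldetection-example,example below>>:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score",
      "metrics": {
        "auc_roc": { "include_curve": true }, <1>
        "precision": { "at": [ 0.4, 0.6 ] }, <2>
        "recall": { "at": [ 0.4, 0.6 ] },
        "confusion_matrix": { "at": [ 0.5 ] }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> Returns the ROC curve points in addition to the score.
<2> Computes the metric at {olscore} thresholds 0.4 and 0.6 instead of the
defaults.
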
[[regression-evaluation-resources]]
=== {regression-cap} evaluation objects

{regression-cap} evaluation evaluates the results of a {regression} analysis
which outputs a prediction of values.

`actual_field`::
(Required, string) The field of the `index` which contains the `ground truth`.
The data type of this field must be numerical.

`predicted_field`::
(Required, string) The field in the `index` that contains the predicted value,
in other words, the results of the {regression} analysis.

`metrics`::
(Optional, object) Specifies the metrics that are used for the evaluation. For
more information on `mse`, `msle`, and `huber`, consult
https://github.com/elastic/examples/tree/master/Machine%20Learning/Regression%20Loss%20Functions[the Jupyter notebook on regression loss functions].
If no metrics are specified, the following are returned by default:

* `mse`,
* `r_squared`,
* `huber` (`delta`: 1.0).

`mse`:::
(Optional, object) Average squared difference between the predicted values
and the actual (`ground truth`) value. For more information, read
{wikipedia}/Mean_squared_error[this wiki article].

`msle`:::
(Optional, object) Average squared difference between the logarithm of the
predicted values and the logarithm of the actual (`ground truth`) value. See
the formulas after this list.

`offset`::::
(Optional, double) Defines the transition point at which you switch from
minimizing quadratic error to minimizing quadratic log error. Defaults to
`1`.

`huber`:::
(Optional, object) Pseudo Huber loss function. For more information, read
{wikipedia}/Huber_loss#Pseudo-Huber_loss_function[this wiki article]. See
the formulas after this list.

`delta`::::
(Optional, double) Approximates 1/2 (prediction - actual)^2^ for values
much less than delta and approximates a straight line with slope delta for
values much larger than delta. Defaults to `1`. Delta needs to be greater
than `0`.

`r_squared`:::
(Optional, object) Proportion of the variance in the dependent variable that
is predictable from the independent variables. For more information, read
{wikipedia}/Coefficient_of_determination[this wiki article].

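For reference, the loss metrics above can be written as follows. This is a
sketch based on the standard definitions in the linked articles, with `p` the
predicted value, `a` the actual value, `o` the `offset`, and `δ` the `delta`;
the exact implementation may differ in details:

[source,latex]
--------------------------------------------------
% Mean squared error over n documents
\mathrm{mse} = \frac{1}{n} \sum_{i=1}^{n} (p_i - a_i)^2

% Mean squared logarithmic error; the offset o shifts both arguments, so
% errors far below o behave quadratically and errors far above o are
% compared on the log scale
\mathrm{msle} = \frac{1}{n} \sum_{i=1}^{n}
  \left( \ln(p_i + o) - \ln(a_i + o) \right)^2

% Pseudo-Huber loss: quadratic near zero, approximately linear with
% slope delta for large errors
\mathrm{huber} = \frac{1}{n} \sum_{i=1}^{n}
  \delta^2 \left( \sqrt{1 + \left( \frac{p_i - a_i}{\delta} \right)^2} - 1 \right)
--------------------------------------------------
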
[[classification-evaluation-resources]]
=== {classification-cap} evaluation objects

{classification-cap} evaluation evaluates the results of a {classanalysis}
which outputs a prediction that identifies to which of the classes each
document belongs.

`actual_field`::
(Required, string) The field of the `index` which contains the `ground truth`.
The data type of this field must be categorical.

`predicted_field`::
(Optional, string) The field in the `index` which contains the predicted value,
in other words, the results of the {classanalysis}.

`top_classes_field`::
(Optional, string) The field of the `index` which is an array of documents
of the form `{ "class_name": XXX, "class_probability": YYY }`.
This field must be defined as `nested` in the mappings (see the mapping sketch
after this list).

`metrics`::
(Optional, object) Specifies the metrics that are used for the evaluation. If
no metrics are specified, the following are returned by default:

* `accuracy`,
* `multiclass_confusion_matrix`,
* `precision`,
* `recall`.

`accuracy`:::
(Optional, object) Accuracy of predictions (per-class and overall).

`auc_roc`:::
(Optional, object) The AUC ROC (area under the curve of the receiver
operating characteristic) score and optionally the curve.
It is calculated for a specific class (provided as "class_name") treated as
positive.

`class_name`::::
(Required, string) Name of the only class that is treated as positive
during AUC ROC calculation. Other classes are treated as negative
("one-vs-all" strategy). All the evaluated documents must have
`class_name` in the list of their top classes.

`include_curve`::::
(Optional, Boolean) Whether or not the curve should be returned in
addition to the score. Default value is false.

`multiclass_confusion_matrix`:::
(Optional, object) Multiclass confusion matrix.

`size`::::
(Optional, double) Specifies the size of the multiclass confusion matrix.
Defaults to `10`, which results in a matrix of size 10x10.

`precision`:::
(Optional, object) Precision of predictions (per-class and average).

`recall`:::
(Optional, object) Recall of predictions (per-class and average).

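When you use `top_classes_field`, the referenced field must be mapped with the
`nested` type. The following is a minimal mapping sketch, assuming the
`animal_classification` destination index and the default `ml` results field
used in the examples below; in practice, {dfanalytics} creates these mappings
for you when it writes the destination index:

[source,console]
--------------------------------------------------
PUT animal_classification
{
  "mappings": {
    "properties": {
      "animal_class": { "type": "keyword" },
      "ml": {
        "properties": {
          "animal_class_prediction": { "type": "keyword" },
          "top_classes": {
            "type": "nested", <1>
            "properties": {
              "class_name":        { "type": "keyword" },
              "class_probability": { "type": "double" }
            }
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> Each `{ "class_name": XXX, "class_probability": YYY }` entry is indexed as
its own hidden document, which is what the evaluation needs in order to read
per-class probabilities.
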
////
[[ml-evaluate-dfanalytics-results]]
== {api-response-body-title}

`outlier_detection`::
(object) If you chose to do outlier detection, the API returns the
following evaluation metrics:

`auc_roc`::: TBD

`confusion_matrix`::: TBD

`precision`::: TBD

`recall`::: TBD
////
[[ml-evaluate-dfanalytics-example]]
== {api-examples-title}

[[ml-evaluate-oldetection-example]]
=== {oldetection-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "outlier_detection": {
    "auc_roc": {
      "value": 0.92584757746414444
    },
    "confusion_matrix": {
      "0.25": {
        "tp": 5,
        "fp": 9,
        "tn": 204,
        "fn": 5
      },
      "0.5": {
        "tp": 1,
        "fp": 5,
        "tn": 208,
        "fn": 9
      },
      "0.75": {
        "tp": 0,
        "fp": 4,
        "tn": 209,
        "fn": 10
      }
    },
    "precision": {
      "0.25": 0.35714285714285715,
      "0.5": 0.16666666666666666,
      "0.75": 0
    },
    "recall": {
      "0.25": 0.5,
      "0.5": 0.1,
      "0.75": 0
    }
  }
}
----
[[ml-evaluate-regression-example]]
=== {regression-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "house_price_predictions", <1>
  "query": {
    "bool": {
      "filter": [
        { "term": { "ml.is_training": false } } <2>
      ]
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price", <3>
      "predicted_field": "ml.price_prediction", <4>
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": { "offset": 10 },
        "huber": { "delta": 1.5 }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The output destination index from a {dfanalytics} {reganalysis}.
<2> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to the test split only.
<3> The ground truth value for the actual house price. This is required in
order to evaluate results.
<4> The predicted value for house price calculated by the {reganalysis}.

The following example calculates the training error:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {},
        "huber": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to the train split only, which
means that a training error is calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.

The next example calculates the testing error. The only difference compared
with the previous example is that `ml.is_training` is set to `false` this time,
so the query excludes the train split from the evaluation.
[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": false <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {},
        "huber": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> In this example, a test/train split (`training_percent`) was defined for
the {reganalysis}. This query limits evaluation to the test split only, which
means that a testing error is calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.
[[ml-evaluate-classification-example]]
=== {classification-cap}

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "animal_classification",
  "evaluation": {
    "classification": { <1>
      "actual_field": "animal_class", <2>
      "predicted_field": "ml.animal_class_prediction", <3>
      "metrics": {
        "multiclass_confusion_matrix": {} <4>
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The evaluation type.
<2> The field that contains the ground truth value for the actual animal
classification. This is required in order to evaluate results.
<3> The field that contains the predicted value for animal classification
calculated by the {classanalysis}.
<4> Specifies the metric for the evaluation.

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
  "classification" : {
    "multiclass_confusion_matrix" : {
      "confusion_matrix" : [
        {
          "actual_class" : "cat", <1>
          "actual_class_doc_count" : 12, <2>
          "predicted_classes" : [ <3>
            {
              "predicted_class" : "cat",
              "count" : 12 <4>
            },
            {
              "predicted_class" : "dog",
              "count" : 0 <5>
            }
          ],
          "other_predicted_class_doc_count" : 0 <6>
        },
        {
          "actual_class" : "dog",
          "actual_class_doc_count" : 11,
          "predicted_classes" : [
            {
              "predicted_class" : "dog",
              "count" : 7
            },
            {
              "predicted_class" : "cat",
              "count" : 4
            }
          ],
          "other_predicted_class_doc_count" : 0
        }
      ],
      "other_actual_class_count" : 0
    }
  }
}
--------------------------------------------------
<1> The name of the actual class that the analysis tried to predict.
<2> The number of documents in the index that belong to the `actual_class`.
<3> This object contains the list of the predicted classes and the number of
predictions associated with the class.
<4> The number of cats in the dataset that are correctly identified as cats.
<5> The number of cats in the dataset that are incorrectly classified as dogs.
<6> The number of documents that are classified as a class that is not listed
as a `predicted_class`.

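If the analysis involves many classes, the `size` parameter described earlier
can cap the dimensions of the matrix. The following sketch reuses the same
index; classes beyond the limit are presumably rolled up into the `other_*`
counts shown above:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "animal_classification",
  "evaluation": {
    "classification": {
      "actual_field": "animal_class",
      "predicted_field": "ml.animal_class_prediction",
      "metrics": {
        "multiclass_confusion_matrix": {
          "size": 5 <1>
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> Limits the confusion matrix to at most 5x5.
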
The following example computes the AUC ROC score for the class `dog`:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "animal_classification",
  "evaluation": {
    "classification": { <1>
      "actual_field": "animal_class", <2>
      "metrics": {
        "auc_roc": { <3>
          "class_name": "dog" <4>
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The evaluation type.
<2> The field that contains the ground truth value for the actual animal
classification. This is required in order to evaluate results.
<3> Specifies the metric for the evaluation.
<4> Specifies the class name that is treated as positive during the
evaluation; all the other classes are treated as negative.

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
  "classification" : {
    "auc_roc" : {
      "value" : 0.8941788639536681
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]