
[DOCS] Moves model snapshot resource definitions into APIs (#50157)

Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>
Lisa Cawley 5 years ago
commit 207094cd67

+ 4 - 2
docs/reference/ml/anomaly-detection/apis/delete-snapshot.asciidoc

@@ -30,10 +30,12 @@ the `model_snapshot_id` in the results from the get jobs API.
 ==== {api-path-parms-title}
 
 `<job_id>`::
-  (Required, string) Identifier for the job.
+(Required, string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 
 `<snapshot_id>`::
-  (Required, string) Identifier for the model snapshot.
+(Required, string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=snapshot-id]
 
 [[ml-delete-snapshot-example]]
 ==== {api-examples-title}

+ 120 - 31
docs/reference/ml/anomaly-detection/apis/get-snapshot.asciidoc

@@ -30,8 +30,13 @@ Retrieves information about model snapshots.
 include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 
 `<snapshot_id>`::
-  (Optional, string) Identifier for the model snapshot. If you do not specify
-  this optional parameter, the API returns information about all model snapshots.
+(Optional, string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=snapshot-id]
++
+--
+If you do not specify this optional parameter, the API returns information about
+all model snapshots.
+--
 
 [[ml-get-snapshot-request-body]]
 ==== {api-request-body-title}
@@ -58,52 +63,136 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 [[ml-get-snapshot-results]]
 ==== {api-response-body-title}
 
-The API returns the following information:
+The API returns an array of model snapshot objects, which have the following 
+properties:
 
-`model_snapshots`::
-  (array) An array of model snapshot objects. For more information, see
-  <<ml-snapshot-resource>>.
+`description`::
+(string) An optional description of the job.
+
+`job_id`::
+(string) A numerical character string that uniquely identifies the job that
+the snapshot was created for.
+
+`latest_record_time_stamp`::
+(date) The timestamp of the latest processed record.
+
+`latest_result_time_stamp`::
+(date) The timestamp of the latest bucket result.
+
+`min_version`::
+(string) The minimum version required to be able to restore the model snapshot.
+
+`model_size_stats`::
+(object) Summary information describing the model.
+
+`model_size_stats`.`bucket_allocation_failures_count`:::
+(long) The number of buckets for which entities were not processed due to memory
+limit constraints.
+
+`model_size_stats`.`job_id`:::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
+
+`model_size_stats`.`log_time`:::
+(date) The timestamp that the `model_size_stats` were recorded, according to
+server-time.
+
+`model_size_stats`.`memory_status`:::
+(string) The status of the memory in relation to its `model_memory_limit`.
+Contains one of the following values.
++
+--
+* `hard_limit`: The internal models require more space than the configured
+memory limit. Some incoming data could not be processed.
+* `ok`: The internal models stayed below the configured value.
+* `soft_limit`: The internal models require more than 60% of the configured
+memory limit and more aggressive pruning will be performed in order to try to
+reclaim space.
+--
+
+`model_size_stats`.`model_bytes`:::
+(long) An approximation of the memory resources required for this analysis.
+
+`model_size_stats`.`model_bytes_exceeded`:::
+(long) The number of bytes over the high limit for memory usage at the last allocation failure.
+
+`model_size_stats`.`model_bytes_memory_limit`:::
+(long) The upper limit for memory usage, checked on increasing values.
+
+`model_size_stats`.`result_type`:::
+(string) Internal. This value is always `model_size_stats`.
+
+`model_size_stats`.`timestamp`:::
+(date) The timestamp that the `model_size_stats` were recorded, according to the
+bucket timestamp of the data.
+
+`model_size_stats`.`total_by_field_count`:::
+(long) The number of _by_ field values analyzed. Note that these are counted
+separately for each detector and partition.
+
+`model_size_stats`.`total_over_field_count`:::
+(long) The number of _over_ field values analyzed. Note that these are counted
+separately for each detector and partition.
+
+`model_size_stats`.`total_partition_field_count`:::
+(long) The number of _partition_ field values analyzed.
+
+`retain`::
+(boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=retain]
+
+`snapshot_id`::
+(string) A numerical character string that uniquely identifies the model
+snapshot. For example: "1491852978".
+
+`snapshot_doc_count`::
+(long) For internal use only.
+
+`timestamp`::
+(date) The creation timestamp for the snapshot.
 
 [[ml-get-snapshot-example]]
 ==== {api-examples-title}
 
 [source,console]
 --------------------------------------------------
-GET _ml/anomaly_detectors/farequote/model_snapshots
+GET _ml/anomaly_detectors/high_sum_total_sales/model_snapshots
 {
-  "start": "1491852977000"
+  "start": "1575402236000"
 }
 --------------------------------------------------
-// TEST[skip:todo]
+// TEST[skip:Kibana sample data]
 
 In this example, the API provides a single result:
 [source,js]
 ----
 {
-  "count": 1,
-  "model_snapshots": [
+  "count" : 1,
+  "model_snapshots" : [
     {
-      "job_id": "farequote",
-      "min_version": "6.3.0",
-      "timestamp": 1491948163000,
-      "description": "State persisted due to job close at 2017-04-11T15:02:43-0700",
-      "snapshot_id": "1491948163",
-      "snapshot_doc_count": 1,
-      "model_size_stats": {
-        "job_id": "farequote",
-        "result_type": "model_size_stats",
-        "model_bytes": 387594,
-        "total_by_field_count": 21,
-        "total_over_field_count": 0,
-        "total_partition_field_count": 20,
-        "bucket_allocation_failures_count": 0,
-        "memory_status": "ok",
-        "log_time": 1491948163000,
-        "timestamp": 1455234600000
+      "job_id" : "high_sum_total_sales",
+      "min_version" : "6.4.0",
+      "timestamp" : 1575402237000,
+      "description" : "State persisted due to job close at 2019-12-03T19:43:57+0000",
+      "snapshot_id" : "1575402237",
+      "snapshot_doc_count" : 1,
+      "model_size_stats" : {
+        "job_id" : "high_sum_total_sales",
+        "result_type" : "model_size_stats",
+        "model_bytes" : 1638816,
+        "model_bytes_exceeded" : 0,
+        "model_bytes_memory_limit" : 10485760,
+        "total_by_field_count" : 3,
+        "total_over_field_count" : 3320,
+        "total_partition_field_count" : 2,
+        "bucket_allocation_failures_count" : 0,
+        "memory_status" : "ok",
+        "log_time" : 1575402237000,
+        "timestamp" : 1576965600000
       },
-      "latest_record_time_stamp": 1455235196000,
-      "latest_result_time_stamp": 1455234900000,
-      "retain": false
+      "latest_record_time_stamp" : 1576971072000,
+      "latest_result_time_stamp" : 1576965600000,
+      "retain" : false
     }
   ]
 }
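
Note on this hunk: the response properties documented above can be read programmatically. The following is a minimal Python sketch, not part of the commit, that parses a trimmed copy of the sample response, converts the epoch-millisecond `timestamp` to UTC, and reports how much of `model_bytes_memory_limit` is used. The helper names `epoch_millis_to_utc` and `summarize_snapshot` are hypothetical and only the standard library is assumed.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample response from the get-snapshot example above.
SAMPLE_RESPONSE = """
{
  "count": 1,
  "model_snapshots": [
    {
      "job_id": "high_sum_total_sales",
      "timestamp": 1575402237000,
      "snapshot_id": "1575402237",
      "model_size_stats": {
        "model_bytes": 1638816,
        "model_bytes_memory_limit": 10485760,
        "memory_status": "ok"
      },
      "retain": false
    }
  ]
}
"""


def epoch_millis_to_utc(millis):
    """Convert an epoch-milliseconds value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def summarize_snapshot(snapshot):
    """Hypothetical helper: pick out a few documented properties of one snapshot."""
    stats = snapshot["model_size_stats"]
    return {
        "snapshot_id": snapshot["snapshot_id"],
        "created": epoch_millis_to_utc(snapshot["timestamp"]).isoformat(),
        "memory_status": stats["memory_status"],
        # Fraction of the configured memory limit that the model currently needs.
        "memory_used_fraction": stats["model_bytes"] / stats["model_bytes_memory_limit"],
    }


response = json.loads(SAMPLE_RESPONSE)
summaries = [summarize_snapshot(s) for s in response["model_snapshots"]]
```

The converted creation time agrees with the `description` field in the sample ("State persisted due to job close at 2019-12-03T19:43:57+0000").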

+ 29 - 28
docs/reference/ml/anomaly-detection/apis/revert-snapshot.asciidoc

@@ -24,7 +24,7 @@ Reverts to a specific snapshot.
 [[ml-revert-snapshot-desc]]
 ==== {api-description-title}
 
-The {ml} feature in {xpack} reacts quickly to anomalous input, learning new
+The {ml-features} react quickly to anomalous input, learning new
 behaviors in data. Highly anomalous input increases the variance in the models
 whilst the system learns whether this is a new step-change in behavior or a
 one-off event. In the case where this anomalous input is known to be a one-off,
@@ -40,7 +40,8 @@ Friday or a critical system failure.
 include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 
 `<snapshot_id>`::
-  (Required, string) Identifier for the model snapshot.
+(Required, string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=snapshot-id]
 
 [[ml-revert-snapshot-request-body]]
 ==== {api-request-body-title}
@@ -57,13 +58,9 @@ If you want to resend data, then delete the intervening results.
 [[ml-revert-snapshot-example]]
 ==== {api-examples-title}
 
-The following example reverts to the `1491856080` snapshot for the
-`it_ops_new_kpi` job:
-
 [source,console]
 --------------------------------------------------
-POST
-_ml/anomaly_detectors/it_ops_new_kpi/model_snapshots/1491856080/_revert
+POST _ml/anomaly_detectors/high_sum_total_sales/model_snapshots/1575402237/_revert
 {
   "delete_intervening_results": true
 }
@@ -74,28 +71,32 @@ When the operation is complete, you receive the following results:
 [source,js]
 ----
 {
-  "model": {
-    "job_id": "it_ops_new_kpi",
-    "min_version": "6.3.0",
-    "timestamp": 1491856080000,
-    "description": "State persisted due to job close at 2017-04-10T13:28:00-0700",
-    "snapshot_id": "1491856080",
-    "snapshot_doc_count": 1,
-    "model_size_stats": {
-      "job_id": "it_ops_new_kpi",
-      "result_type": "model_size_stats",
-      "model_bytes": 29518,
-      "total_by_field_count": 3,
-      "total_over_field_count": 0,
-      "total_partition_field_count": 2,
-      "bucket_allocation_failures_count": 0,
-      "memory_status": "ok",
-      "log_time": 1491856080000,
-      "timestamp": 1455318000000
+  "model" : {
+    "job_id" : "high_sum_total_sales",
+    "min_version" : "6.4.0",
+    "timestamp" : 1575402237000,
+    "description" : "State persisted due to job close at 2019-12-03T19:43:57+0000",
+    "snapshot_id" : "1575402237",
+    "snapshot_doc_count" : 1,
+    "model_size_stats" : {
+      "job_id" : "high_sum_total_sales",
+      "result_type" : "model_size_stats",
+      "model_bytes" : 1638816,
+      "model_bytes_exceeded" : 0,
+      "model_bytes_memory_limit" : 10485760,
+      "total_by_field_count" : 3,
+      "total_over_field_count" : 3320,
+      "total_partition_field_count" : 2,
+      "bucket_allocation_failures_count" : 0,
+      "memory_status" : "ok",
+      "log_time" : 1575402237000,
+      "timestamp" : 1576965600000
     },
-    "latest_record_time_stamp": 1455318669000,
-    "latest_result_time_stamp": 1455318000000,
-    "retain": false
+    "latest_record_time_stamp" : 1576971072000,
+    "latest_result_time_stamp" : 1576965600000,
+    "retain" : false
   }
 }
 ----
+
+For a description of these properties, see the <<ml-get-snapshot-results,get model snapshots API>>.
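
Note on this hunk: the revert request shown above has a simple shape, a `POST` to a path built from the job ID and snapshot ID, with `delete_intervening_results` in the body. A minimal Python sketch of assembling it follows; `build_revert_request` is a hypothetical helper, and actually sending the request (client, host, authentication) is left out on purpose.

```python
import json
from urllib.parse import quote


def build_revert_request(job_id, snapshot_id, delete_intervening_results=False):
    """Hypothetical helper: return (method, path, body) for a revert request.

    The path layout mirrors the console example in the diff above.
    """
    path = "_ml/anomaly_detectors/{}/model_snapshots/{}/_revert".format(
        quote(job_id, safe=""), quote(snapshot_id, safe="")
    )
    body = json.dumps({"delete_intervening_results": delete_intervening_results})
    return "POST", path, body


method, path, body = build_revert_request(
    "high_sum_total_sales", "1575402237", delete_intervening_results=True
)
```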

+ 0 - 104
docs/reference/ml/anomaly-detection/apis/snapshotresource.asciidoc

@@ -1,104 +0,0 @@
-[role="xpack"]
-[testenv="platinum"]
-[[ml-snapshot-resource]]
-=== Model snapshot resources
-
-Model snapshots are saved to an internal index within the Elasticsearch cluster.
-By default, this is occurs approximately every 3 hours to 4 hours and is
-configurable with the `background_persist_interval` property.
-
-By default, model snapshots are retained for one day (twenty-four hours). You
-can change this behavior by updating the `model_snapshot_retention_days` for the
-job. When choosing a new value, consider the following:
-
-* Persistence enables resilience in the event of a system failure.
-* Persistence enables snapshots to be reverted.
-* The time taken to persist a job is proportional to the size of the model in memory.
-
-A model snapshot resource has the following properties:
-
-`description`::
-  (string) An optional description of the job.
-
-`job_id`::
-  (string) A numerical character string that uniquely identifies the job that
-  the snapshot was created for.
-
-`min_version`::
-  (string) The minimum version required to be able to restore the model snapshot.
-
-`latest_record_time_stamp`::
-  (date) The timestamp of the latest processed record.
-
-`latest_result_time_stamp`::
-  (date) The timestamp of the latest bucket result.
-
-`model_size_stats`::
-  (object) Summary information describing the model.
-  See <<ml-snapshot-stats,Model Size Statistics>>.
-
-`retain`::
-  (boolean) If true, this snapshot will not be deleted during automatic cleanup
-  of snapshots older than `model_snapshot_retention_days`.
-  However, this snapshot will be deleted when the job is deleted.
-  The default value is false.
-
-`snapshot_id`::
-  (string) A numerical character string that uniquely identifies the model
-  snapshot. For example: "1491852978".
-
-`snapshot_doc_count`::
-  (long) For internal use only.
-
-`timestamp`::
-  (date) The creation timestamp for the snapshot.
-
-NOTE: All of these properties are informational with the exception of
-`description` and `retain`.
-
-[float]
-[[ml-snapshot-stats]]
-==== Model Size Statistics
-
-The `model_size_stats` object has the following properties:
-
-`bucket_allocation_failures_count`::
-  (long) The number of buckets for which entities were not processed due to
-  memory limit constraints.
-
-`job_id`::
-  (string) A numerical character string that uniquely identifies the job.
-
-`log_time`::
-  (date) The timestamp that the `model_size_stats` were recorded, according to
-  server-time.
-
-`memory_status`::
-  (string) The status of the memory in relation to its `model_memory_limit`.
-  Contains one of the following values.
-  `ok`::: The internal models stayed below the configured value.
-  `soft_limit`::: The internal models require more than 60% of the configured
-  memory limit and more aggressive pruning will
-  be performed in order to try to reclaim space.
-  `hard_limit`::: The internal models require more space that the configured
-  memory limit. Some incoming data could not be processed.
-
-`model_bytes`::
-  (long) An approximation of the memory resources required for this analysis.
-
-`result_type`::
-  (string) Internal. This value is always set to "model_size_stats".
-
-`timestamp`::
-  (date) The timestamp that the `model_size_stats` were recorded, according to the bucket timestamp of the data.
-
-`total_by_field_count`::
-  (long) The number of _by_ field values analyzed. Note that these are counted separately for each detector and partition.
-
-`total_over_field_count`::
-  (long) The number of _over_ field values analyzed. Note that these are counted separately for each detector and partition.
-
-`total_partition_field_count`::
-  (long) The number of _partition_ field values analyzed.
-
-NOTE: All of these properties are informational; you cannot change their values.

+ 3 - 3
docs/reference/ml/anomaly-detection/apis/update-job.asciidoc

@@ -64,9 +64,9 @@ NOTE: You can update the `analysis_limits` only while the job is closed. The
 `model_memory_limit` property value cannot be decreased below the current usage.
  
 TIP: If the `memory_status` property in the
-<<ml-snapshot-stats,`model_size_stats` object>> has a value of `hard_limit`,
-this means that it was unable to process some data. You might want to re-run
-the job with an increased `model_memory_limit`.
+<<ml-get-snapshot-results,`model_size_stats` object>> has a value of `hard_limit`,
+this means that it was unable to process some data. You might want to re-run the
+job with an increased `model_memory_limit`.
 
 --
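
Note on this hunk: the TIP above, together with the `memory_status` values documented in the get-snapshot hunk, amounts to a small decision rule. A minimal Python sketch, assuming a `model_size_stats` object has already been fetched and parsed into a dict; `memory_limit_advice` is a hypothetical helper and the advice strings paraphrase the documentation text.

```python
def memory_limit_advice(model_size_stats):
    """Hypothetical helper: map `memory_status` to the guidance given in the docs."""
    status = model_size_stats.get("memory_status")
    if status == "hard_limit":
        # Per the TIP: some data could not be processed.
        return "Consider re-running the job with an increased model_memory_limit."
    if status == "soft_limit":
        # Models need more than 60% of the limit; pruning becomes more aggressive.
        return "Memory use is above 60% of the limit; aggressive pruning is in effect."
    if status == "ok":
        return "The internal models stayed below the configured limit."
    return "Unrecognized memory_status: {!r}".format(status)


advice = memory_limit_advice({"memory_status": "hard_limit"})
```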
 

+ 6 - 7
docs/reference/ml/anomaly-detection/apis/update-snapshot.asciidoc

@@ -29,7 +29,8 @@ Updates certain properties of a snapshot.
 include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 
 `<snapshot_id>`::
-  (Required, string) Identifier for the model snapshot.
+(Required, string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=snapshot-id]
 
 [[ml-update-snapshot-request-body]]
 ==== {api-request-body-title}
@@ -37,14 +38,12 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 The following properties can be updated after the model snapshot is created:
 
 `description`::
-  (Optional, string) A description of the model snapshot. For example,
-  "Before black friday".
+(Optional, string) A description of the model snapshot.
 
 `retain`::
-  (Optional, boolean) If true, this snapshot will not be deleted during
-  automatic cleanup of snapshots older than `model_snapshot_retention_days`.
-  Note that this snapshot will still be deleted when the {anomaly-job} is
-  deleted. The default value is false.
+(Optional, boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=retain]
+
 
 [[ml-update-snapshot-example]]
 ==== {api-examples-title}
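
Note on this hunk: per the request-body section above, only `description` (string) and `retain` (boolean) can be updated after a snapshot is created. A minimal Python sketch of building such a body and rejecting anything else follows; `build_update_body` and `UPDATABLE_PROPERTIES` are hypothetical names, and the client-side check is an illustration, not behavior the API is documented to share.

```python
import json

# The two updatable properties named in the request-body section, with their types.
UPDATABLE_PROPERTIES = {"description": str, "retain": bool}


def build_update_body(**changes):
    """Hypothetical helper: serialize an update body, allowing only documented fields."""
    if not changes:
        raise ValueError("at least one of description or retain is required")
    for name, value in changes.items():
        expected_type = UPDATABLE_PROPERTIES.get(name)
        if expected_type is None:
            raise ValueError("property cannot be updated: {}".format(name))
        if not isinstance(value, expected_type):
            raise TypeError("{} must be of type {}".format(name, expected_type.__name__))
    return json.dumps(changes, sort_keys=True)


update_body = build_update_body(description="Snapshot before the sale", retain=True)
```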

+ 11 - 2
docs/reference/ml/ml-shared.asciidoc

@@ -895,8 +895,7 @@ end::model-plot-config[]
 
 tag::model-snapshot-id[]
 A numerical character string that uniquely identifies the model snapshot. For 
-example, `1491007364`. For more information about model snapshots, see
-<<ml-snapshot-resource>>.
+example, `1575402236000`.
 end::model-snapshot-id[]
 
 tag::model-snapshot-retention-days[]
@@ -1006,6 +1005,12 @@ are deleted from {es}. The default value is null, which means results are
 retained.
 end::results-retention-days[]
 
+tag::retain[]
+If `true`, this snapshot will not be deleted during automatic cleanup of
+snapshots older than `model_snapshot_retention_days`. However, this snapshot
+will be deleted when the job is deleted. The default value is `false`.
+end::retain[]
+
 tag::script-fields[]
 Specifies scripts that evaluate custom expressions and returns script fields to
 the {dfeed}. The detector configuration objects in a job can contain functions
@@ -1023,6 +1028,10 @@ Specifies the maximum number of {dfanalytics-jobs} to obtain. The default value
 is `100`.
 end::size[]
 
+tag::snapshot-id[]
+Identifier for the model snapshot.
+end::snapshot-id[]
+
 tag::source-put-dfa[]
 The configuration of how to source the analysis data. It requires an 
 `index`. Optionally, `query` and `_source` may be specified.

+ 6 - 0
docs/reference/redirects.asciidoc

@@ -1082,3 +1082,9 @@ See
 [[ml-stats-node]]
 the details in <<ml-get-job-stats>>.
 
+[role="exclude",id="ml-snapshot-resource"]
+=== Model snapshot resources
+
+This page was deleted.
+[[ml-snapshot-stats]]
+See <<ml-update-snapshot>> and <<ml-get-snapshot>>.

+ 0 - 2
docs/reference/rest-api/defs.asciidoc

@@ -7,14 +7,12 @@ These resource definitions are used in APIs related to {ml-features} and
 
 
 * <<ml-dfa-analysis-objects>>
-* <<ml-snapshot-resource,{anomaly-detect-cap} model snapshots>>
 * <<ml-results-resource,{anomaly-detect-cap} results>>
 * <<role-mapping-resources,Role mappings>>
 * <<transform-resource,{transforms-cap}>>
 
 
 include::{es-repo-dir}/ml/df-analytics/apis/analysisobjects.asciidoc[]
-include::{es-repo-dir}/ml/anomaly-detection/apis/snapshotresource.asciidoc[]
 include::{xes-repo-dir}/rest-api/security/role-mapping-resources.asciidoc[]
 include::{es-repo-dir}/ml/anomaly-detection/apis/resultsresource.asciidoc[]
 include::{es-repo-dir}/transform/apis/transformresource.asciidoc[]