123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111 |
- [role="xpack"]
- [testenv="platinum"]
- [[ml-post-data]]
- === Post data to jobs API
- ++++
- <titleabbrev>Post data to jobs</titleabbrev>
- ++++
- Sends data to an anomaly detection job for analysis.
- [[ml-post-data-request]]
- ==== {api-request-title}
- `POST _ml/anomaly_detectors/<job_id>/_data`
- [[ml-post-data-prereqs]]
- ==== {api-prereq-title}
- * If the {es} {security-features} are enabled, you must have `manage_ml` or
- `manage` cluster privileges to use this API. See
- <<security-privileges>>.
- [[ml-post-data-desc]]
- ==== {api-description-title}
- The job must have a state of `open` to receive and process the data.
- The data that you send to the job must use the JSON format. Multiple JSON
- documents can be sent, either adjacent with no separator in between them or
- whitespace separated. Newline delimited JSON (NDJSON) is a possible whitespace
- separated format, and for this the `Content-Type` header should be set to
- `application/x-ndjson`.
- Upload sizes are limited to the Elasticsearch HTTP receive buffer size
- (default 100 Mb). If your data is larger, split it into multiple chunks
- and upload each one separately in sequential time order. When running in
- real time, it is generally recommended that you perform many small uploads,
- rather than queueing data to upload larger files.
- When uploading data, check the <<ml-datacounts,job data counts>> for progress.
- The following records will not be processed:
- * Records not in chronological order and outside the latency window
- * Records with an invalid timestamp
- //TBD link to Working with Out of Order timeseries concept doc
- IMPORTANT: For each job, data can only be accepted from a single connection at
- a time. It is not currently possible to post data to multiple jobs using wildcards
- or a comma-separated list.
- [[ml-post-data-path-parms]]
- ==== {api-path-parms-title}
- `<job_id>`::
- (Required, string) Identifier for the job.
- [[ml-post-data-query-parms]]
- ==== {api-query-parms-title}
- `reset_start`::
- (Optional, string) Specifies the start of the bucket resetting range.
- `reset_end`::
- (Optional, string) Specifies the end of the bucket resetting range.
- [[ml-post-data-request-body]]
- ==== {api-request-body-title}
- A sequence of one or more JSON documents containing the data to be analyzed.
- Only whitespace characters are permitted in between the documents.
- [[ml-post-data-example]]
- ==== {api-examples-title}
- The following example posts data from the `it_ops_new_kpi.json` file to the
- `it_ops_new_kpi` job:
- [source,js]
- --------------------------------------------------
- $ curl -s -H "Content-type: application/json"
- -X POST http:\/\/localhost:9200/_ml/anomaly_detectors/it_ops_new_kpi/_data
- --data-binary @it_ops_new_kpi.json
- --------------------------------------------------
- When the data is sent, you receive information about the operational progress of
- the job. For example:
- [source,js]
- ----
- {
- "job_id":"it_ops_new_kpi",
- "processed_record_count":21435,
- "processed_field_count":64305,
- "input_bytes":2589063,
- "input_field_count":85740,
- "invalid_date_count":0,
- "missing_field_count":0,
- "out_of_order_timestamp_count":0,
- "empty_bucket_count":16,
- "sparse_bucket_count":0,
- "bucket_count":2165,
- "earliest_record_timestamp":1454020569000,
- "latest_record_timestamp":1455318669000,
- "last_data_time":1491952300658,
- "latest_empty_bucket_timestamp":1454541600000,
- "input_record_count":21435
- }
- ----
- For more information about these properties, see <<ml-jobstats,Job Stats>>.
|