[role="xpack"]
[testenv="platinum"]
[[ml-post-data]]
=== Post data to jobs API
++++
<titleabbrev>Post data to jobs</titleabbrev>
++++

Sends data to an anomaly detection job for analysis.

==== Request

`POST _ml/anomaly_detectors/<job_id>/_data`

==== Description

The job must have a state of `open` to receive and process the data.

The data that you send to the job must use the JSON format. You can send
multiple JSON documents, either adjacent with no separator between them or
separated by whitespace. Newline-delimited JSON (NDJSON) is one such
whitespace-separated format; when you use it, set the `Content-Type` header to
`application/x-ndjson`.
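
As a minimal sketch of the NDJSON option described above, the following Python
snippet serializes a list of records into a request body with one JSON document
per line. The record field names are hypothetical and only illustrate the
shape of the payload; they are not taken from a real job configuration.

```python
import json

def to_ndjson(records):
    """Serialize an iterable of dicts as newline-delimited JSON bytes."""
    # json.dumps escapes newlines inside strings, so each document stays on one line.
    lines = [json.dumps(record, separators=(",", ":")) for record in records]
    return ("\n".join(lines) + "\n").encode("utf-8")

# Hypothetical records; the field names are illustrative assumptions.
records = [
    {"timestamp": 1454020569000, "kpi_indicator": "online_purchases", "events_per_min": 20},
    {"timestamp": 1454020629000, "kpi_indicator": "online_purchases", "events_per_min": 24},
]

body = to_ndjson(records)
# Header to send alongside the body when posting NDJSON.
headers = {"Content-Type": "application/x-ndjson"}
```

The resulting `body` and `headers` can be passed to any HTTP client that sends
a `POST` request to the endpoint shown in the Request section.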

Upload sizes are limited to the Elasticsearch HTTP receive buffer size
(default 100 MB). If your data is larger, split it into multiple chunks
and upload each one separately in sequential time order. When running in
real time, it is generally better to perform many small uploads than to
queue data and upload larger files.
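
One way to follow the chunking advice above is sketched below. It assumes the
records fit in memory, that each record carries a numeric time field (named
`timestamp` here, which is an assumption), and that the caller chooses a
`max_bytes` value below the configured upload limit. The helper name
`chunk_records` is hypothetical.

```python
import json

def chunk_records(records, max_bytes, time_field="timestamp"):
    """Yield NDJSON payloads of at most max_bytes each, in ascending time order."""
    chunk, size = [], 0
    # Sort first so that chunks are produced, and can be uploaded, in time order.
    for record in sorted(records, key=lambda r: r[time_field]):
        line = json.dumps(record).encode("utf-8") + b"\n"
        if len(line) > max_bytes:
            raise ValueError("a single record exceeds max_bytes")
        # Start a new chunk when adding this line would exceed the limit.
        if chunk and size + len(line) > max_bytes:
            yield b"".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield b"".join(chunk)

# Illustrative, deliberately unordered input with a tiny size limit.
sample = [{"timestamp": t, "value": t % 7} for t in (3000, 1000, 2000, 4000)]
payloads = list(chunk_records(sample, max_bytes=80))
```

Each payload would then be posted in the order it is yielded, waiting for one
upload to finish before starting the next.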

When uploading data, check the <<ml-datacounts,job data counts>> for progress.
The following records are not processed:

* Records that are not in chronological order and fall outside the latency window
* Records with an invalid timestamp

//TBD link to Working with Out of Order timeseries concept doc

IMPORTANT: For each job, data can only be accepted from a single connection at
a time. It is not currently possible to post data to multiple jobs using wildcards
or a comma-separated list.

==== Path Parameters

`job_id` (required)::
(string) Identifier for the job.

==== Query Parameters

`reset_start`::
(string) Specifies the start of the bucket resetting range.

`reset_end`::
(string) Specifies the end of the bucket resetting range.
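
The following sketch shows how the path and query parameters might be combined
into a request URL. The helper name `post_data_url` is hypothetical, and the
epoch-millisecond strings used for `reset_start` and `reset_end` are an
assumption for illustration; this page only states that both values are
strings.

```python
from urllib.parse import quote, urlencode

def post_data_url(base_url, job_id, reset_start=None, reset_end=None):
    """Build the post data URL, adding only the reset parameters that are set."""
    url = f"{base_url.rstrip('/')}/_ml/anomaly_detectors/{quote(job_id, safe='')}/_data"
    candidates = (("reset_start", reset_start), ("reset_end", reset_end))
    params = {name: value for name, value in candidates if value is not None}
    return f"{url}?{urlencode(params)}" if params else url

# Assumed value format: epoch milliseconds passed as strings.
url = post_data_url(
    "http://localhost:9200",
    "it_ops_new_kpi",
    reset_start="1454020569000",
    reset_end="1454024169000",
)
plain_url = post_data_url("http://localhost:9200/", "it_ops_new_kpi")
```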

==== Request Body

A sequence of one or more JSON documents containing the data to be analyzed.
Only whitespace characters are permitted between the documents.

==== Authorization

You must have `manage_ml` or `manage` cluster privileges to use this API.
For more information, see
{xpack-ref}/security-privileges.html[Security Privileges].
//<<privileges-list-cluster>>.

==== Examples

The following example posts data from the `it_ops_new_kpi.json` file to the
`it_ops_new_kpi` job:

[source,js]
--------------------------------------------------
$ curl -s -H "Content-type: application/json" \
  -X POST http://localhost:9200/_ml/anomaly_detectors/it_ops_new_kpi/_data \
  --data-binary @it_ops_new_kpi.json
--------------------------------------------------

When the data is sent, you receive information about the operational progress
of the job. For example:

[source,js]
----
{
  "job_id":"it_ops_new_kpi",
  "processed_record_count":21435,
  "processed_field_count":64305,
  "input_bytes":2589063,
  "input_field_count":85740,
  "invalid_date_count":0,
  "missing_field_count":0,
  "out_of_order_timestamp_count":0,
  "empty_bucket_count":16,
  "sparse_bucket_count":0,
  "bucket_count":2165,
  "earliest_record_timestamp":1454020569000,
  "latest_record_timestamp":1455318669000,
  "last_data_time":1491952300658,
  "latest_empty_bucket_timestamp":1454541600000,
  "input_record_count":21435
}
----

For more information about these properties, see <<ml-jobstats,Job Stats>>.
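
A client might inspect the returned counts to spot records that were not
processed. The sketch below parses a trimmed copy of the example response and
derives two figures. The helper name `summarize_counts` is hypothetical, and
the interpretation of each field is assumed from its name; the Job Stats
reference remains the authority on their exact meaning.

```python
import json

def summarize_counts(response_text):
    """Derive simple health figures from a post data response body."""
    counts = json.loads(response_text)
    return {
        "job_id": counts["job_id"],
        # Records that were posted but not reported as processed.
        "unprocessed_records": counts["input_record_count"] - counts["processed_record_count"],
        # Records rejected for an invalid or out-of-order timestamp.
        "timestamp_problems": counts["invalid_date_count"] + counts["out_of_order_timestamp_count"],
    }

# Trimmed copy of the example response shown above.
response_text = """
{
  "job_id": "it_ops_new_kpi",
  "processed_record_count": 21435,
  "invalid_date_count": 0,
  "out_of_order_timestamp_count": 0,
  "input_record_count": 21435
}
"""

summary = summarize_counts(response_text)
```

A nonzero value in either derived figure suggests checking the input for the
two kinds of unprocessed records listed in the Description section.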