  1. [role="xpack"]
  2. [testenv="platinum"]
  3. [[ml-datafeed-resource]]
  4. === {dfeed-cap} Resources
  5. A {dfeed} resource has the following properties:
  6. `aggregations`::
  7. (object) If set, the {dfeed} performs aggregation searches.
  8. Support for aggregations is limited; use aggregations only with
  9. low-cardinality data. For more information, see
  10. {xpack-ref}/ml-configuring-aggregation.html[Aggregating Data for Faster Performance].
  11. `chunking_config`::
  12. (object) Specifies how data searches are split into time chunks.
  13. See <<ml-datafeed-chunking-config>>.
  14. For example: `{"mode": "manual", "time_span": "3h"}`
  15. `datafeed_id`::
  16. (string) A numerical character string that uniquely identifies the {dfeed}.
  17. This property is informational; you cannot change the identifier for existing
  18. {dfeeds}.
  19. `frequency`::
  20. (time units) The interval at which scheduled queries are made while the
  21. {dfeed} runs in real time. The default value is either the bucket span for short
  22. bucket spans, or, for longer bucket spans, a sensible fraction of the bucket
  23. span. For example: `150s`.
  24. `indices`::
  25. (array) An array of index names. For example: `["it_ops_metrics"]`
  26. `job_id`::
  27. (string) The unique identifier for the job to which the {dfeed} sends data.
  28. `query`::
  29. (object) The {es} query domain-specific language (DSL). This value
  30. corresponds to the query object in an {es} search POST body. All the
  31. options that are supported by {es} can be used, as this object is
  32. passed verbatim to {es}. By default, this property has the following
  33. value: `{"match_all": {"boost": 1}}`.
  34. `query_delay`::
  35. (time units) The number of seconds behind real time that data is queried. For
  36. example, if data from 10:04 a.m. might not be searchable in {es} until
  37. 10:06 a.m., set this property to 120 seconds. The default value is randomly
  38. selected between `60s` and `120s`. This randomness improves the query
  39. performance when there are multiple jobs running on the same node.
  40. `script_fields`::
  41. (object) Specifies scripts that evaluate custom expressions and return
  42. script fields to the {dfeed}.
  43. The <<ml-detectorconfig,detector configuration objects>> in a job can contain
  44. functions that use these script fields.
  45. For more information, see
  46. {xpack-ref}/ml-configuring-transform.html[Transforming Data With Script Fields].
  47. `scroll_size`::
  48. (unsigned integer) The `size` parameter that is used in {es} searches.
  49. The default value is `1000`.
  50. `types`::
  51. (array) A list of types to search for within the specified indices. For
  52. example: `[]`. This property is provided for backwards compatibility with
  53. releases earlier than 6.0.0. For more information, see <<removal-of-types>>.
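
As a rough illustration of how these properties fit together, the following Python sketch assembles a {dfeed} configuration as a plain dictionary and works through the `query_delay` example from the list above. The job identifier `it_ops_job` and the helper `latest_queried_time` are hypothetical names used only for this sketch; the property names and example values are taken from the property descriptions.

```python
import json
from datetime import datetime, timedelta

# Hypothetical datafeed configuration built from the documented properties.
# "it_ops_job" is an assumed job identifier, not one defined in this document.
datafeed_config = {
    "job_id": "it_ops_job",
    "indices": ["it_ops_metrics"],
    "query": {"match_all": {"boost": 1}},  # documented default query
    "frequency": "150s",
    "query_delay": "120s",
    "scroll_size": 1000,  # documented default
    "chunking_config": {"mode": "manual", "time_span": "3h"},
}


def latest_queried_time(now: datetime, query_delay_seconds: int) -> datetime:
    """Return the most recent timestamp a search covers, given the delay."""
    return now - timedelta(seconds=query_delay_seconds)


# With a 120 second delay, a search that runs at 10:06 covers data up to 10:04.
run_time = datetime(2024, 1, 1, 10, 6)
cutoff = latest_queried_time(run_time, 120)

print(json.dumps(datafeed_config, indent=2))
print(cutoff.strftime("%H:%M"))
```

The dictionary only mirrors the shape of the resource; it does not call any API, and the date used for `run_time` is arbitrary.
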
  54. [[ml-datafeed-chunking-config]]
  55. ==== Chunking Configuration Objects
  56. {dfeeds-cap} might be required to search over long time periods, such as several months
  57. or years. This search is split into time chunks to manage the load
  58. on {es}. Chunking configuration controls how the size of these time
  59. chunks is calculated and is an advanced configuration option.
  60. A chunking configuration object has the following properties:
  61. `mode`::
  62. There are three available modes: +
  63. `auto`::: The chunk size will be dynamically calculated. This is the default
  64. and recommended value.
  65. `manual`::: Chunking will be applied according to the specified `time_span`.
  66. `off`::: No chunking will be applied.
  67. `time_span`::
  68. (time units) The time span that each search will be querying.
  69. This setting is only applicable when the mode is set to `manual`.
  70. For example: `3h`.
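
To make the `manual` and `off` modes more concrete, the following Python sketch splits a search range into consecutive time chunks. It is an illustration under stated assumptions, not the implementation used by {es}: the function name `split_into_chunks` is hypothetical, and the `auto` mode is deliberately not modeled because its dynamic calculation is not described here.

```python
from datetime import datetime, timedelta
from typing import Iterator, Optional, Tuple


def split_into_chunks(
    start: datetime,
    end: datetime,
    mode: str = "manual",
    time_span: Optional[timedelta] = None,
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (chunk_start, chunk_end) pairs that cover the range start..end."""
    if mode == "off":
        # No chunking: the whole range is searched in one go.
        yield (start, end)
        return
    if mode != "manual":
        # "auto" sizes chunks dynamically; that calculation is not modeled here.
        raise ValueError(f"mode {mode!r} is not modeled in this sketch")
    if time_span is None or time_span <= timedelta(0):
        raise ValueError("manual mode needs a positive time_span")
    cursor = start
    while cursor < end:
        # The final chunk is shortened so it never runs past the end time.
        chunk_end = min(cursor + time_span, end)
        yield (cursor, chunk_end)
        cursor = chunk_end


range_start = datetime(2024, 1, 1, 0, 0)
range_end = datetime(2024, 1, 1, 8, 0)
chunks = list(split_into_chunks(range_start, range_end, "manual", timedelta(hours=3)))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), "->", chunk_end.isoformat())
```

With a `time_span` of `3h`, the eight-hour range in this example is covered by two full chunks and one shorter final chunk.
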
  71. [float]
  72. [[ml-datafeed-counts]]
  73. ==== {dfeed-cap} Counts
  74. The get {dfeed} statistics API provides information about the operational
  75. progress of a {dfeed}. All of these properties are informational; you cannot
  76. update their values:
  77. `assignment_explanation`::
  78. (string) For started {dfeeds} only, contains messages relating to the
  79. selection of a node.
  80. `datafeed_id`::
  81. (string) A numerical character string that uniquely identifies the {dfeed}.
  82. `node`::
  83. (object) The node upon which the {dfeed} is started. The {dfeed} and job will
  84. be on the same node.
  85. `id`::: The unique identifier of the node. For example,
  86. `0-o0tOoRTwKFZifatTWKNw`.
  87. `name`::: The node name. For example, `0-o0tOo`.
  88. `ephemeral_id`::: The node ephemeral ID.
  89. `transport_address`::: The host and port where transport connections are
  90. accepted. For example, `127.0.0.1:9300`.
  91. `attributes`::: The node attributes. For example, `{"ml.max_open_jobs": "10"}`.
  92. `state`::
  93. (string) The status of the {dfeed}, which can be one of the following values: +
  94. `started`::: The {dfeed} is actively receiving data.
  95. `stopped`::: The {dfeed} is stopped and will not receive data until it is
  96. restarted.
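
The following Python sketch shows one way a client might summarize a single statistics entry that has the properties described above. The sample entry is hypothetical: the `datafeed_id` value `datafeed-it-ops` is invented for illustration, the node values reuse the examples from the list, and `describe_datafeed` is an assumed helper name rather than part of any API.

```python
from typing import Any, Dict

# Hypothetical statistics entry shaped like the documented properties.
# "datafeed-it-ops" is an invented identifier; node values reuse the examples.
sample_stats: Dict[str, Any] = {
    "datafeed_id": "datafeed-it-ops",
    "state": "started",
    "node": {
        "id": "0-o0tOoRTwKFZifatTWKNw",
        "name": "0-o0tOo",
        "transport_address": "127.0.0.1:9300",
        "attributes": {"ml.max_open_jobs": "10"},
    },
}


def describe_datafeed(stats: Dict[str, Any]) -> str:
    """Build a one-line summary from a datafeed statistics entry."""
    datafeed_id = stats["datafeed_id"]
    state = stats["state"]
    node = stats.get("node")
    if state == "started" and node:
        # Node details are only meaningful while the datafeed is started.
        return (
            f"{datafeed_id} is started on node {node['name']} "
            f"({node['transport_address']})"
        )
    return f"{datafeed_id} is {state}"


print(describe_datafeed(sample_stats))
print(describe_datafeed({"datafeed_id": "datafeed-it-ops", "state": "stopped"}))
```

The sketch treats `node` as optional because the node details describe where a {dfeed} is started.
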