
[role="xpack"]
[[ml-datafeed-resource]]
=== {dfeed-cap} Resources

A {dfeed} resource has the following properties:

`aggregations`::
  (object) If set, the {dfeed} performs aggregation searches.
  Support for aggregations is limited and should only be used with
  low cardinality data. For more information, see
  {xpack-ref}/ml-configuring-aggregation.html[Aggregating Data for Faster Performance].

`chunking_config`::
  (object) Specifies how data searches are split into time chunks.
  See <<ml-datafeed-chunking-config>>.
  For example: `{"mode": "manual", "time_span": "3h"}`

`datafeed_id`::
  (string) The unique identifier for the {dfeed}.
  This property is informational; you cannot change the identifier for existing
  {dfeeds}.

`frequency`::
  (time units) The interval at which scheduled queries are made while the
  {dfeed} runs in real time. The default value is either the bucket span for short
  bucket spans, or, for longer bucket spans, a sensible fraction of the bucket
  span. For example: `150s`.

`indices`::
  (array) An array of index names. For example: `["it_ops_metrics"]`

`job_id`::
  (string) The unique identifier for the job to which the {dfeed} sends data.

`query`::
  (object) The {es} query domain-specific language (DSL). This value
  corresponds to the query object in an {es} search POST body. All the
  options that are supported by {es} can be used, as this object is
  passed verbatim to {es}. By default, this property has the following
  value: `{"match_all": {"boost": 1}}`.

`query_delay`::
  (time units) The number of seconds behind real time that data is queried. For
  example, if data from 10:04 a.m. might not be searchable in {es} until
  10:06 a.m., set this property to 120 seconds. The default value is randomly
  selected between `60s` and `120s`. This randomness improves the query
  performance when there are multiple jobs running on the same node.

`script_fields`::
  (object) Specifies scripts that evaluate custom expressions and return
  script fields to the {dfeed}.
  The <<ml-detectorconfig,detector configuration objects>> in a job can contain
  functions that use these script fields.
  For more information, see
  {xpack-ref}/ml-configuring-transform.html[Transforming Data With Script Fields].

`scroll_size`::
  (unsigned integer) The `size` parameter that is used in {es} searches.
  The default value is `1000`.

`types`::
  (array) A list of types to search for within the specified indices. For
  example: `[]`. This property is provided for backwards compatibility with
  releases earlier than 6.0.0. For more information, see <<removal-of-types>>.

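
To illustrate how these properties fit together, the following is a minimal
sketch of a {dfeed} resource. The `datafeed_id` and `job_id` values are
hypothetical placeholders. The other values reuse the examples and defaults
listed above. Optional properties such as `aggregations`, `script_fields`, and
`types` are omitted.

[source,js]
----
{
  "datafeed_id": "datafeed-example",
  "job_id": "job-example",
  "indices": ["it_ops_metrics"],
  "query": {
    "match_all": {
      "boost": 1
    }
  },
  "frequency": "150s",
  "query_delay": "120s",
  "scroll_size": 1000
}
----
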
[[ml-datafeed-chunking-config]]
==== Chunking Configuration Objects

{dfeeds-cap} might be required to search over long time periods, such as
several months or years. The search is split into time chunks to manage the
load on {es}. The chunking configuration controls how the size of these time
chunks is calculated; it is an advanced configuration option.

A chunking configuration object has the following properties:

`mode`::
  There are three available modes: +
  `auto`::: The chunk size is calculated dynamically. This is the default
  and recommended value.
  `manual`::: Chunking is applied according to the specified `time_span`.
  `off`::: No chunking is applied.

`time_span`::
  (time units) The time span that each search queries.
  This setting is only applicable when the mode is set to `manual`.
  For example: `3h`.

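
As an illustration, the following fragment sets manual chunking with the `3h`
time span used in the examples above. Assuming the chunks are laid end to end
from the start of the search range, a search over 12 hours of data would be
split into four searches of three hours each. Only the `chunking_config`
property is shown; the rest of the {dfeed} configuration is omitted.

[source,js]
----
{
  "chunking_config": {
    "mode": "manual",
    "time_span": "3h"
  }
}
----
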
[float]
[[ml-datafeed-counts]]
==== {dfeed-cap} Counts

The get {dfeed} statistics API provides information about the operational
progress of a {dfeed}. All of these properties are informational; you cannot
update their values:

`assignment_explanation`::
  (string) For started {dfeeds} only, contains messages relating to the
  selection of a node.

`datafeed_id`::
  (string) The unique identifier for the {dfeed}.

`node`::
  (object) The node upon which the {dfeed} is started. The {dfeed} and its job
  run on the same node.
  `id`::: The unique identifier of the node. For example,
  `0-o0tOoRTwKFZifatTWKNw`.
  `name`::: The node name. For example, `0-o0tOo`.
  `ephemeral_id`::: The node ephemeral ID.
  `transport_address`::: The host and port where transport connections are
  accepted. For example, `127.0.0.1:9300`.
  `attributes`::: For example, `{"ml.max_open_jobs": "10"}`.

`state`::
  (string) The status of the {dfeed}, which can be one of the following values: +
  `started`::: The {dfeed} is actively receiving data.
  `stopped`::: The {dfeed} is stopped and will not receive data until it is
  restarted.
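
The following is a minimal sketch of how these properties might look for a
single started {dfeed}. The `datafeed_id` and `ephemeral_id` values are
hypothetical placeholders, and the empty `assignment_explanation` is an
assumption. The remaining values reuse the examples above. The enclosing
structure of the API response is not shown.

[source,js]
----
{
  "datafeed_id": "datafeed-example",
  "state": "started",
  "node": {
    "id": "0-o0tOoRTwKFZifatTWKNw",
    "name": "0-o0tOo",
    "ephemeral_id": "example-ephemeral-id",
    "transport_address": "127.0.0.1:9300",
    "attributes": {
      "ml.max_open_jobs": "10"
    }
  },
  "assignment_explanation": ""
}
----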