123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168 |
- [role="xpack"]
- [[ml-settings]]
- === Machine learning settings in Elasticsearch
- ++++
- <titleabbrev>Machine learning settings</titleabbrev>
- ++++
- [[ml-settings-description]]
- // tag::ml-settings-description-tag[]
- You do not need to configure any settings to use {ml}. It is enabled by default.
- IMPORTANT: {ml-cap} uses SSE4.2 instructions, so it works only on machines whose
- CPUs {wikipedia}/SSE4#Supporting_CPUs[support] SSE4.2. If you run {es} on older
- hardware, you must disable {ml} (by setting `xpack.ml.enabled` to `false`).
- // end::ml-settings-description-tag[]
- [discrete]
- [[general-ml-settings]]
- ==== General machine learning settings
- `node.roles: [ ml ]`::
- (<<static-cluster-setting,Static>>) Set `node.roles` to contain `ml` to identify
- the node as a _{ml} node_ that is capable of running jobs. Every node is a {ml}
- node by default.
- +
- If you use the `node.roles` setting, then all required roles must be explicitly
- set. Consult <<modules-node>> to learn more.
- +
- IMPORTANT: On dedicated coordinating nodes or dedicated master nodes, do not set
- the `ml` role.
- `xpack.ml.enabled`::
- (<<static-cluster-setting,Static>>) Set to `true` (default) to enable {ml} APIs
- on the node.
- +
- If set to `false`, the {ml} APIs are disabled on the node. Therefore the node
- cannot open jobs, start {dfeeds}, or receive transport (internal) communication
- requests related to {ml} APIs. If the node is a coordinating node, {ml} requests
- from clients (including {kib}) also fail. For more information about disabling
- {ml} in specific {kib} instances, see
- {kibana-ref}/ml-settings-kb.html[{kib} {ml} settings].
- +
- IMPORTANT: If you want to use {ml-features} in your cluster, it is recommended
- that you set `xpack.ml.enabled` to `true` on all nodes. This is the default
- behavior. At a minimum, it must be enabled on all master-eligible nodes. If you
- want to use {ml-features} in clients or {kib}, it must also be enabled on all
- coordinating nodes.
- `xpack.ml.inference_model.cache_size`::
- (<<static-cluster-setting,Static>>) The maximum inference cache size allowed.
- The inference cache exists in the JVM heap on each ingest node. The cache
- affords faster processing times for the `inference` processor. The value can be
- a static byte sized value (i.e. "2gb") or a percentage of total allocated heap.
- The default is "40%". See also <<model-inference-circuit-breaker>>.
- [[xpack-interference-model-ttl]]
- // tag::interference-model-ttl-tag[]
- `xpack.ml.inference_model.time_to_live` {ess-icon}::
- (<<static-cluster-setting,Static>>) The time to live (TTL) for models in the
- inference model cache. The TTL is calculated from last access. The `inference`
- processor attempts to load the model from cache. If the `inference` processor
- does not receive any documents for the duration of the TTL, the referenced model
- is flagged for eviction from the cache. If a document is processed later, the
- model is again loaded into the cache. Defaults to `5m`.
- // end::interference-model-ttl-tag[]
- `xpack.ml.max_inference_processors`::
- (<<cluster-update-settings,Dynamic>>) The total number of `inference` type
- processors allowed across all ingest pipelines. Once the limit is reached,
- adding an `inference` processor to a pipeline is disallowed. Defaults to `50`.
- `xpack.ml.max_machine_memory_percent`::
- (<<cluster-update-settings,Dynamic>>) The maximum percentage of the machine's
- memory that {ml} may use for running analytics processes. (These processes are
- separate to the {es} JVM.) Defaults to `30` percent. The limit is based on the
- total memory of the machine, not current free memory. Jobs are not allocated to
- a node if doing so would cause the estimated memory use of {ml} jobs to exceed
- the limit.
- `xpack.ml.max_model_memory_limit`::
- (<<cluster-update-settings,Dynamic>>) The maximum `model_memory_limit` property
- value that can be set for any job on this node. If you try to create a job with
- a `model_memory_limit` property value that is greater than this setting value,
- an error occurs. Existing jobs are not affected when you update this setting.
- For more information about the `model_memory_limit` property, see
- <<put-analysislimits>>.
- [[xpack.ml.max_open_jobs]]
- `xpack.ml.max_open_jobs`::
- (<<cluster-update-settings,Dynamic>>) The maximum number of jobs that can run
- simultaneously on a node. Defaults to `20`. In this context, jobs include both
- {anomaly-jobs} and {dfanalytics-jobs}. The maximum number of jobs is also
- constrained by memory usage. Thus if the estimated memory usage of the jobs
- would be higher than allowed, fewer jobs will run on a node. Prior to version
- 7.1, this setting was a per-node non-dynamic setting. It became a cluster-wide
- dynamic setting in version 7.1. As a result, changes to its value after node
- startup are used only after every node in the cluster is running version 7.1 or
- higher. The maximum permitted value is `512`.
- `xpack.ml.node_concurrent_job_allocations`::
- (<<cluster-update-settings,Dynamic>>) The maximum number of jobs that can
- concurrently be in the `opening` state on each node. Typically, jobs spend a
- small amount of time in this state before they move to `open` state. Jobs that
- must restore large models when they are opening spend more time in the `opening`
- state. Defaults to `2`.
- [discrete]
- [[advanced-ml-settings]]
- ==== Advanced machine learning settings
- These settings are for advanced use cases; the default values are generally
- sufficient:
- `xpack.ml.enable_config_migration`::
- (<<cluster-update-settings,Dynamic>>) Reserved.
- `xpack.ml.max_anomaly_records`::
- (<<cluster-update-settings,Dynamic>>) The maximum number of records that are
- output per bucket. The default value is `500`.
- `xpack.ml.max_lazy_ml_nodes`::
- (<<cluster-update-settings,Dynamic>>) The number of lazily spun up {ml} nodes.
- Useful in situations where {ml} nodes are not desired until the first {ml} job
- opens. It defaults to `0` and has a maximum acceptable value of `3`. If the
- current number of {ml} nodes is greater than or equal to this setting, it is
- assumed that there are no more lazy nodes available as the desired number
- of nodes have already been provisioned. If a job is opened and this setting has
- a value greater than zero and there are no nodes that can accept the job, the
- job stays in the `OPENING` state until a new {ml} node is added to the cluster
- and the job is assigned to run on that node.
- +
- IMPORTANT: This setting assumes some external process is capable of adding {ml}
- nodes to the cluster. This setting is only useful when used in conjunction with
- such an external process.
- `xpack.ml.process_connect_timeout`::
- (<<cluster-update-settings,Dynamic>>) The connection timeout for {ml} processes
- that run separately from the {es} JVM. Defaults to `10s`. Some {ml} processing
- is done by processes that run separately to the {es} JVM. When such processes
- are started they must connect to the {es} JVM. If such a process does not
- connect within the time period specified by this setting then the process is
- assumed to have failed. Defaults to `10s`. The minimum value for this setting is
- `5s`.
- [discrete]
- [[model-inference-circuit-breaker]]
- ==== {ml-cap} circuit breaker settings
- `breaker.model_inference.limit`::
- (<<cluster-update-settings,Dynamic>>) Limit for the model inference breaker,
- which defaults to 50% of the JVM heap. If the parent circuit breaker is less
- than 50% of the JVM heap, it is bound to that limit instead. See
- <<circuit-breaker>>.
- `breaker.model_inference.overhead`::
- (<<cluster-update-settings,Dynamic>>) A constant that all accounting estimations
- are multiplied by to determine a final estimation. Defaults to 1. See
- <<circuit-breaker>>.
- `breaker.model_inference.type`::
- (<<static-cluster-setting,Static>>) The underlying type of the circuit breaker.
- There are two valid options: `noop` and `memory`. `noop` means the circuit
- breaker does nothing to prevent too much memory usage. `memory` means the
- circuit breaker tracks the memory used by inference models and can potentially
- break and prevent `OutOfMemory` errors. The default is `memory`.
|