- [role="xpack"]
- [[ml-settings]]
- === Machine learning settings in Elasticsearch
- ++++
- <titleabbrev>Machine learning settings</titleabbrev>
- ++++
- You do not need to configure any settings to use {ml}. It is enabled by default.
- IMPORTANT: {ml-cap} uses SSE4.2 instructions, so it works only on machines whose
- CPUs https://en.wikipedia.org/wiki/SSE4#Supporting_CPUs[support] SSE4.2. If you
- run {es} on older hardware, you must disable {ml} by setting `xpack.ml.enabled`
- to `false`.
- All of these settings can be added to the `elasticsearch.yml` configuration file.
- The dynamic settings can also be updated across a cluster with the
- <<cluster-update-settings,cluster update settings API>>.
- TIP: Dynamic settings take precedence over settings in the `elasticsearch.yml`
- file.
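As a minimal sketch of the hardware note above, the following `elasticsearch.yml` fragment disables {ml} on a node whose CPU lacks SSE4.2 support. It uses only the `xpack.ml.enabled` setting described on this page; the location of the file depends on your installation.

```yaml
# elasticsearch.yml (illustrative fragment)
# Disable machine learning on this node, for example because the CPU
# does not support SSE4.2 instructions.
xpack.ml.enabled: false
```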
- [float]
- [[general-ml-settings]]
- ==== General machine learning settings
- `node.ml`::
- Set to `true` (default) to identify the node as a _machine learning node_. +
- +
- If set to `false` in `elasticsearch.yml`, the node cannot run jobs. If set to
- `true` but `xpack.ml.enabled` is set to `false`, the `node.ml` setting is
- ignored and the node cannot run jobs. If you want to run jobs, there must be at
- least one machine learning node in your cluster. +
- +
- IMPORTANT: On dedicated coordinating nodes or dedicated master nodes, disable
- the `node.ml` role.
- `xpack.ml.enabled`::
- Set to `true` (default) to enable {ml} on the node. +
- +
- If set to `false` in `elasticsearch.yml`, the {ml} APIs are disabled on the node.
- Therefore the node cannot open jobs, start {dfeeds}, or receive transport (internal)
- communication requests related to {ml} APIs. It also affects all {kib} instances
- that connect to this {es} instance; you do not need to disable {ml} in those
- `kibana.yml` files. For more information about disabling {ml} in specific {kib}
- instances, see
- {kibana-ref}/ml-settings-kb.html[{kib} Machine Learning Settings].
- +
- IMPORTANT: If you want to use {ml} features in your cluster, you must have
- `xpack.ml.enabled` set to `true` on all master-eligible nodes. This is the
- default behavior.
- `xpack.ml.inference_model.cache_size`::
- The maximum size of the inference cache. The inference cache resides in the JVM
- heap on each ingest node and speeds up processing for the `inference`
- processor. The value can be a static byte-sized value (for example, `2gb`)
- or a percentage of the total allocated heap. Defaults to `40%`.
- `xpack.ml.inference_model.time_to_live`::
- The time to live (TTL) for models in the inference model cache. The TTL is
- calculated from last access. The `inference` processor attempts to load the
- model from cache. If the `inference` processor does not receive any documents
- for the duration of the TTL, the referenced model is flagged for eviction from
- the cache. If a document is processed later, the model is again loaded into the
- cache. Defaults to `5m`.
- `xpack.ml.max_inference_processors` (<<cluster-update-settings,Dynamic>>)::
- The total number of `inference` type processors allowed across all ingest
- pipelines. Once the limit is reached, adding an `inference` processor to
- a pipeline is disallowed. Defaults to `50`.
- `xpack.ml.max_machine_memory_percent` (<<cluster-update-settings,Dynamic>>)::
- The maximum percentage of the machine's memory that {ml} may use for running
- analytics processes. (These processes are separate from the {es} JVM.) Defaults to
- `30` percent. The limit is based on the total memory of the machine, not current
- free memory. Jobs are not allocated to a node if doing so would cause the
- estimated memory use of {ml} jobs to exceed the limit.
- `xpack.ml.max_model_memory_limit` (<<cluster-update-settings,Dynamic>>)::
- The maximum `model_memory_limit` property value that can be set for any job on
- this node. If you try to create a job with a `model_memory_limit` property value
- that is greater than this setting value, an error occurs. Existing jobs are not
- affected when you update this setting. For more information about the
- `model_memory_limit` property, see <<put-analysislimits>>.
- [[xpack.ml.max_open_jobs]]
- `xpack.ml.max_open_jobs` (<<cluster-update-settings,Dynamic>>)::
- The maximum number of jobs that can run simultaneously on a node. Defaults to
- `20`. In this context, jobs include both {anomaly-jobs} and {dfanalytics-jobs}.
- The maximum number of jobs is also constrained by memory usage. Thus if the
- estimated memory usage of the jobs would be higher than allowed, fewer jobs will
- run on a node. Prior to version 7.1, this setting was a per-node non-dynamic
- setting. It became a cluster-wide dynamic setting in version 7.1. As a result,
- changes to its value after node startup are used only after every node in the
- cluster is running version 7.1 or higher. The maximum permitted value is `512`.
- `xpack.ml.node_concurrent_job_allocations` (<<cluster-update-settings,Dynamic>>)::
- The maximum number of jobs that can concurrently be in the `opening` state on
- each node. Typically, jobs spend a small amount of time in this state before
- they move to `open` state. Jobs that must restore large models when they are
- opening spend more time in the `opening` state. Defaults to `2`.
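The `node.ml` and `xpack.ml.enabled` guidance above might look like the following on a dedicated master-eligible node. This is a sketch only: the `node.master`, `node.data`, and `node.ingest` settings are not described on this page and are assumed here to be how the node's other roles are configured in your version of {es}.

```yaml
# elasticsearch.yml for a dedicated master-eligible node (illustrative)
node.master: true      # assumption: role setting not covered on this page
node.data: false       # assumption: role setting not covered on this page
node.ingest: false     # assumption: role setting not covered on this page

# Do not run machine learning jobs on a dedicated master node.
node.ml: false

# Keep the machine learning APIs enabled; this must be true on all
# master-eligible nodes if the cluster uses machine learning (the default).
xpack.ml.enabled: true
```

With this layout, jobs run only on other nodes that leave `node.ml` at its default of `true`.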
- [float]
- [[advanced-ml-settings]]
- ==== Advanced machine learning settings
- These settings are for advanced use cases; the default values are generally
- sufficient:
- `xpack.ml.enable_config_migration` (<<cluster-update-settings,Dynamic>>)::
- Reserved.
- `xpack.ml.max_anomaly_records` (<<cluster-update-settings,Dynamic>>)::
- The maximum number of records that are output per bucket. The default value is
- `500`.
- `xpack.ml.max_lazy_ml_nodes` (<<cluster-update-settings,Dynamic>>)::
- The number of lazily provisioned {ml} nodes. This setting is useful when
- {ml} nodes are not wanted until the first {ml} job is opened. Defaults to `0`;
- the maximum permitted value is `3`. If the current number of {ml} nodes is
- greater than or equal to this setting, it is assumed that no more lazy nodes
- are available because the desired number of nodes has already been
- provisioned. If a job is opened when this setting is greater than `0` and no
- node can accept the job, the job stays in the `OPENING` state until a new {ml}
- node is added to the cluster and the job is assigned to run on that node.
- +
- IMPORTANT: This setting assumes that some external process is capable of adding
- {ml} nodes to the cluster. It is useful only when used in conjunction with
- such an external process.
- `xpack.ml.process_connect_timeout` (<<cluster-update-settings,Dynamic>>)::
- The connection timeout for {ml} processes that run separately from the {es} JVM.
- When such a process starts, it must connect to the {es} JVM. If it does not
- connect within the time period specified by this setting, the process is
- assumed to have failed. Defaults to `10s`. The minimum value for this setting
- is `5s`.
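For illustration, the advanced settings above could be set in `elasticsearch.yml` as follows. The values are arbitrary examples chosen within the limits stated on this page, not recommendations. Because these settings are dynamic, a value applied through the cluster update settings API takes precedence over the file.

```yaml
# elasticsearch.yml (illustrative fragment; example values only)

# Allow one machine learning node to be provisioned lazily by an external
# process (default 0, maximum 3).
xpack.ml.max_lazy_ml_nodes: 1

# Give external machine learning processes longer to connect to the JVM
# (default 10s, minimum 5s).
xpack.ml.process_connect_timeout: 20s
```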