[role="xpack"][[ml-settings]]=== Machine learning settings in Elasticsearch++++<titleabbrev>Machine learning settings</titleabbrev>++++You do not need to configure any settings to use {ml}. It is enabled by default.IMPORTANT: {ml-cap} uses SSE4.2 instructions, so will only work on machines whoseCPUs https://en.wikipedia.org/wiki/SSE4#Supporting_CPUs[support] SSE4.2. If yourun {es} on older hardware you must disable {ml} (by setting `xpack.ml.enabled`to `false`).All of these settings can be added to the `elasticsearch.yml` configuration file. The dynamic settings can also be updated across a cluster with the <<cluster-update-settings,cluster update settings API>>.TIP: Dynamic settings take precedence over settings in the `elasticsearch.yml` file.[float][[general-ml-settings]]==== General machine learning settings`node.ml`::Set to `true` (default) to identify the node as a _machine learning node_. ++If set to `false` in `elasticsearch.yml`, the node cannot run jobs. If set to`true` but `xpack.ml.enabled` is set to `false`, the `node.ml` setting isignored and the node cannot run jobs. If you want to run jobs, there must be atleast one machine learning node in your cluster. ++IMPORTANT: On dedicated coordinating nodes or dedicated master nodes, disablethe `node.ml` role.`xpack.ml.enabled`::Set to `true` (default) to enable {ml} on the node.+If set to `false`, the {ml} APIs are disabled on the node. Therefore the nodecannot open jobs, start {dfeeds}, or receive transport (internal) communicationrequests related to {ml} APIs. If the node is a coordinating node, {ml} requestsfrom clients (including {kib}) also fail. For more information about disabling{ml} in specific {kib} instances, see{kibana-ref}/ml-settings-kb.html[{kib} {ml} settings].+IMPORTANT: If you want to use {ml-features} in your cluster, it is recommendedthat you set `xpack.ml.enabled` to `true` on all nodes. This is thedefault behavior. At a minimum, it must be enabled on all master-eligible nodes.If you want to use {ml-features} in clients or {kib}, it must also be enabled onall coordinating nodes.`xpack.ml.inference_model.cache_size`::The maximum inference cache size allowed. The inference cache exists in the JVMheap on each ingest node. The cache affords faster processing times for the`inference` processor. The value can be a static byte sized value (i.e. "2gb")or a percentage of total allocated heap. The default is "40%".See also <<model-inference-circuit-breaker>>.`xpack.ml.inference_model.time_to_live`::The time to live (TTL) for models in the inference model cache. The TTL iscalculated from last access. The `inference` processor attempts to load themodel from cache. If the `inference` processor does not receive any documentsfor the duration of the TTL, the referenced model is flagged for eviction fromthe cache. If a document is processed later, the model is again loaded into thecache. Defaults to `5m`.`xpack.ml.max_inference_processors` (<<cluster-update-settings,Dynamic>>)::The total number of `inference` type processors allowed across all ingestpipelines. Once the limit is reached, adding an `inference` processor toa pipeline is disallowed. Defaults to `50`.`xpack.ml.max_machine_memory_percent` (<<cluster-update-settings,Dynamic>>)::The maximum percentage of the machine's memory that {ml} may use for runninganalytics processes. (These processes are separate to the {es} JVM.) Defaults to`30` percent. The limit is based on the total memory of the machine, not currentfree memory. 
[float]
[[general-ml-settings]]
==== General machine learning settings

`node.ml`::
Set to `true` (default) to identify the node as a _machine learning node_.
+
If set to `false` in `elasticsearch.yml`, the node cannot run jobs. If set to
`true` but `xpack.ml.enabled` is set to `false`, the `node.ml` setting is
ignored and the node cannot run jobs. If you want to run jobs, there must be at
least one machine learning node in your cluster (see the example configuration
after this list).
+
IMPORTANT: On dedicated coordinating nodes or dedicated master nodes, disable
the `node.ml` role.

`xpack.ml.enabled`::
Set to `true` (default) to enable {ml} on the node.
+
If set to `false`, the {ml} APIs are disabled on the node. Therefore the node
cannot open jobs, start {dfeeds}, or receive transport (internal) communication
requests related to {ml} APIs. If the node is a coordinating node, {ml}
requests from clients (including {kib}) also fail. For more information about
disabling {ml} in specific {kib} instances, see
{kibana-ref}/ml-settings-kb.html[{kib} {ml} settings].
+
IMPORTANT: If you want to use {ml-features} in your cluster, it is recommended
that you set `xpack.ml.enabled` to `true` on all nodes. This is the default
behavior. At a minimum, it must be enabled on all master-eligible nodes. If you
want to use {ml-features} in clients or {kib}, it must also be enabled on all
coordinating nodes.

`xpack.ml.inference_model.cache_size`::
The maximum inference cache size allowed. The inference cache exists in the JVM
heap on each ingest node. The cache affords faster processing times for the
`inference` processor. The value can be a static byte sized value (for example,
`2gb`) or a percentage of total allocated heap. Defaults to `40%`. See also
<<model-inference-circuit-breaker>>.

`xpack.ml.inference_model.time_to_live`::
The time to live (TTL) for models in the inference model cache. The TTL is
calculated from last access. The `inference` processor attempts to load the
model from cache. If the `inference` processor does not receive any documents
for the duration of the TTL, the referenced model is flagged for eviction from
the cache. If a document is processed later, the model is again loaded into the
cache. Defaults to `5m`.

`xpack.ml.max_inference_processors` (<<cluster-update-settings,Dynamic>>)::
The total number of `inference` type processors allowed across all ingest
pipelines. Once the limit is reached, adding an `inference` processor to a
pipeline is disallowed. Defaults to `50`.

`xpack.ml.max_machine_memory_percent` (<<cluster-update-settings,Dynamic>>)::
The maximum percentage of the machine's memory that {ml} may use for running
analytics processes. (These processes are separate from the {es} JVM.) Defaults
to `30` percent. The limit is based on the total memory of the machine, not
current free memory. Jobs are not allocated to a node if doing so would cause
the estimated memory use of {ml} jobs to exceed the limit.

`xpack.ml.max_model_memory_limit` (<<cluster-update-settings,Dynamic>>)::
The maximum `model_memory_limit` property value that can be set for any job on
this node. If you try to create a job with a `model_memory_limit` property
value that is greater than this setting value, an error occurs. Existing jobs
are not affected when you update this setting. For more information about the
`model_memory_limit` property, see <<put-analysislimits>>.

[[xpack.ml.max_open_jobs]]
`xpack.ml.max_open_jobs` (<<cluster-update-settings,Dynamic>>)::
The maximum number of jobs that can run simultaneously on a node. Defaults to
`20`. In this context, jobs include both {anomaly-jobs} and {dfanalytics-jobs}.
The maximum number of jobs is also constrained by memory usage. Thus if the
estimated memory usage of the jobs would be higher than allowed, fewer jobs run
on a node. Prior to version 7.1, this setting was a per-node non-dynamic
setting. It became a cluster-wide dynamic setting in version 7.1. As a result,
changes to its value after node startup are used only after every node in the
cluster is running version 7.1 or higher. The maximum permitted value is `512`.

`xpack.ml.node_concurrent_job_allocations` (<<cluster-update-settings,Dynamic>>)::
The maximum number of jobs that can concurrently be in the `opening` state on
each node. Typically, jobs spend a small amount of time in this state before
they move to `open` state. Jobs that must restore large models when they are
opening spend more time in the `opening` state. Defaults to `2`.
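As a minimal sketch of how `node.ml` and `xpack.ml.enabled` combine, the
following `elasticsearch.yml` fragment describes a dedicated {ml} node. It is
illustrative only; the `node.master`, `node.data`, and `node.ingest` role
settings are the standard legacy node role settings and are assumptions about
your topology, not values taken from this page:

[source,yaml]
----
# elasticsearch.yml on a dedicated machine learning node (illustrative)
node.master: false     # not master-eligible
node.data: false       # holds no shard data
node.ingest: false     # runs no ingest pipelines
node.ml: true          # may run machine learning jobs
xpack.ml.enabled: true # ML APIs available on this node
----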
[float]
[[advanced-ml-settings]]
==== Advanced machine learning settings

These settings are for advanced use cases; the default values are generally
sufficient:

`xpack.ml.enable_config_migration` (<<cluster-update-settings,Dynamic>>)::
Reserved.

`xpack.ml.max_anomaly_records` (<<cluster-update-settings,Dynamic>>)::
The maximum number of records that are output per bucket. Defaults to `500`.

`xpack.ml.max_lazy_ml_nodes` (<<cluster-update-settings,Dynamic>>)::
The number of lazily spun up {ml} nodes. Useful in situations where {ml} nodes
are not desired until the first {ml} job is opened. Defaults to `0` and has a
maximum acceptable value of `3`. If the current number of {ml} nodes is greater
than or equal to this setting, it is assumed that there are no more lazy nodes
available, as the desired number of nodes has already been provisioned. If a
job is opened when this setting is greater than `0` and there are no nodes that
can accept the job, the job stays in the `OPENING` state until a new {ml} node
is added to the cluster and the job is assigned to run on that node.
+
IMPORTANT: This setting assumes some external process is capable of adding {ml}
nodes to the cluster. This setting is only useful when used in conjunction with
such an external process.

`xpack.ml.process_connect_timeout` (<<cluster-update-settings,Dynamic>>)::
The connection timeout for {ml} processes that run separately from the {es}
JVM. When such processes are started, they must connect to the {es} JVM. If a
process does not connect within the time period specified by this setting, the
process is assumed to have failed. Defaults to `10s`. The minimum value for
this setting is `5s`.
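Because these advanced settings are dynamic, they can be changed at runtime.
As a sketch, the following request provisions one lazy {ml} node and lengthens
the process connection timeout; the values are examples, not recommendations:

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "xpack.ml.max_lazy_ml_nodes": 1,
    "xpack.ml.process_connect_timeout": "20s"
  }
}
----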
[[model-inference-circuit-breaker]]
==== {ml-cap} circuit breaker settings

`breaker.model_inference.limit` (<<cluster-update-settings,Dynamic>>)::
The limit for the model inference breaker. Defaults to 50% of the JVM heap. If
the parent circuit breaker is less than 50% of the JVM heap, this setting is
bound to that limit instead. See <<circuit-breaker>>.

`breaker.model_inference.overhead` (<<cluster-update-settings,Dynamic>>)::
A constant that all accounting estimations are multiplied by to determine a
final estimation. Defaults to `1`. See <<circuit-breaker>>.

`breaker.model_inference.type`::
The underlying type of the circuit breaker. There are two valid options:
`noop`, meaning the circuit breaker does nothing to prevent too much memory
usage, and `memory`, meaning the circuit breaker tracks the memory used by
inference models and can break to prevent `OutOfMemory` errors. Defaults to
`memory`.
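As a sketch, the two dynamic breaker settings above can be tuned together at
runtime; `breaker.model_inference.type` is not marked as dynamic, so it would
instead belong in `elasticsearch.yml`. The values shown are illustrative:

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "breaker.model_inference.limit": "40%",
    "breaker.model_inference.overhead": 1.2
  }
}
----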