5 years ago · e12d6f168c
--- a/docs/reference/settings/ml-settings.asciidoc
+++ b/docs/reference/settings/ml-settings.asciidoc
@@ -10,15 +10,9 @@
 
				 // tag::ml-settings-description-tag[]
			
 
				 You do not need to configure any settings to use {ml}. It is enabled by default.
			
 
				 
			
 
				-IMPORTANT: {ml-cap} uses SSE4.2 instructions, so will only work on machines whose
			
 
				-CPUs {wikipedia}/SSE4#Supporting_CPUs[support] SSE4.2. If you
			
 
				-run {es} on older hardware you must disable {ml} (by setting `xpack.ml.enabled`
			
 
				-to `false`).
			
 
				-
			
 
				-All of these settings can be added to the `elasticsearch.yml` configuration 
			
 
				-file. The dynamic settings can also be updated across a cluster with the
			
 
				-<<cluster-update-settings,cluster update settings API>>. Dynamic settings take 
			
 
				-precedence over settings in the `elasticsearch.yml` file.
			
 
				+IMPORTANT: {ml-cap} uses SSE4.2 instructions, so it works only on machines whose
			
 
				+CPUs {wikipedia}/SSE4#Supporting_CPUs[support] SSE4.2. If you run {es} on older
			
 
				+hardware, you must disable {ml} (by setting `xpack.ml.enabled` to `false`).
			
 
				 
			
 
				 // end::ml-settings-description-tag[]
			
 
				 
			
@@ -27,8 +21,9 @@ precedence over settings in the `elasticsearch.yml` file.
 
				 ==== General machine learning settings
			
 
				 
			
 
				 `node.roles: [ ml ]`::
			
 
				-Set `node.roles` to contain `ml` to identify the node as a _{ml} node_ that is 
			
 
				-capable of running jobs. Every node is a {ml} node by default.+
			
 
				+(<<static-cluster-setting,Static>>) Set `node.roles` to contain `ml` to identify
			
 
				+the node as a _{ml} node_ that is capable of running jobs. Every node is a {ml}
			
 
				+node by default.
			
 
				 +
			
 
				 If you use the `node.roles` setting, then all required roles must be explicitly  
			
 
				 set. Consult <<modules-node>> to learn more.
			
@@ -38,7 +33,8 @@ the `ml` role.
 
				 
			
 
				 
			
 
				 `xpack.ml.enabled`::
			
 
				-Set to `true` (default) to enable {ml} APIs on the node.
			
 
				+(<<static-cluster-setting,Static>>) Set to `true` (default) to enable {ml} APIs
			
 
				+on the node.
			
 
				 +
			
 
				 If set to `false`, the {ml} APIs are disabled on the node. Therefore the node
			
 
				 cannot open jobs, start {dfeeds}, or receive transport (internal) communication
			
@@ -54,58 +50,62 @@ want to use {ml-features} in clients or {kib}, it must also be enabled on all
 
				 coordinating nodes.
			
 
				 
			
 
				 `xpack.ml.inference_model.cache_size`::
			
 
				-The maximum inference cache size allowed. The inference cache exists in the JVM
			
 
				-heap on each ingest node. The cache affords faster processing times for the
			
 
				-`inference` processor. The value can be a static byte sized value (i.e. "2gb")
			
 
				-or a percentage of total allocated heap. The default is "40%".
			
 
				-See also <<model-inference-circuit-breaker>>.
			
 
				+(<<static-cluster-setting,Static>>) The maximum inference cache size allowed.
			
 
				+The inference cache exists in the JVM heap on each ingest node. The cache
			
 
				+affords faster processing times for the `inference` processor. The value can be
			
 
				+a static byte sized value (i.e. "2gb") or a percentage of total allocated heap.
			
 
				+The default is "40%". See also <<model-inference-circuit-breaker>>.
			
 
				 
			
 
				 [[xpack-interference-model-ttl]]
			
 
				 // tag::interference-model-ttl-tag[]
			
 
				 `xpack.ml.inference_model.time_to_live` {ess-icon}::
			
 
				-The time to live (TTL) for models in the inference model cache. The TTL is
			
 
				-calculated from last access. The `inference` processor attempts to load the
			
 
				-model from cache. If the `inference` processor does not receive any documents
			
 
				-for the duration of the TTL, the referenced model is flagged for eviction from
			
 
				-the cache. If a document is processed later, the model is again loaded into the
			
 
				-cache. Defaults to `5m`.
			
 
				+(<<static-cluster-setting,Static>>) The time to live (TTL) for models in the
			
 
				+inference model cache. The TTL is calculated from last access. The `inference`
			
 
				+processor attempts to load the model from cache. If the `inference` processor
			
 
				+does not receive any documents for the duration of the TTL, the referenced model
			
 
				+is flagged for eviction from the cache. If a document is processed later, the
			
 
				+model is again loaded into the cache. Defaults to `5m`.
			
 
				 // end::interference-model-ttl-tag[]
			
 
				 
			
 
				-`xpack.ml.max_inference_processors` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The total number of `inference` type processors allowed across all ingest
			
 
				-pipelines. Once the limit is reached, adding an `inference` processor to
			
 
				-a pipeline is disallowed. Defaults to `50`.
			
 
				-
			
 
				-`xpack.ml.max_machine_memory_percent` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The maximum percentage of the machine's memory that {ml} may use for running
			
 
				-analytics processes. (These processes are separate to the {es} JVM.) Defaults to
			
 
				-`30` percent. The limit is based on the total memory of the machine, not current
			
 
				-free memory. Jobs will not be allocated to a node if doing so would cause the
			
 
				-estimated memory use of {ml} jobs to exceed the limit.
			
 
				-
			
 
				-`xpack.ml.max_model_memory_limit` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The maximum `model_memory_limit` property value that can be set for any job on
			
 
				-this node. If you try to create a job with a `model_memory_limit` property value
			
 
				-that is greater than this setting value, an error occurs. Existing jobs are not
			
 
				-affected when you update this setting. For more information about the
			
 
				-`model_memory_limit` property, see <<put-analysislimits>>.
			
 
				+`xpack.ml.max_inference_processors`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The total number of `inference` type
			
 
				+processors allowed across all ingest pipelines. Once the limit is reached,
			
 
				+adding an `inference` processor to a pipeline is disallowed. Defaults to `50`.
			
 
				+
			
 
				+`xpack.ml.max_machine_memory_percent`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The maximum percentage of the machine's
			
 
				+memory that {ml} may use for running analytics processes. (These processes are
			
 
				+separate to the {es} JVM.) Defaults to `30` percent. The limit is based on the
			
 
				+total memory of the machine, not current free memory. Jobs are not allocated to
			
 
				+a node if doing so would cause the estimated memory use of {ml} jobs to exceed
			
 
				+the limit.
			
 
				+
			
 
				+`xpack.ml.max_model_memory_limit`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The maximum `model_memory_limit` property
			
 
				+value that can be set for any job on this node. If you try to create a job with
			
 
				+a `model_memory_limit` property value that is greater than this setting value,
			
 
				+an error occurs. Existing jobs are not affected when you update this setting.
			
 
				+For more information about the `model_memory_limit` property, see
			
 
				+<<put-analysislimits>>.
			
 
				 
			
 
				 [[xpack.ml.max_open_jobs]]
			
 
				-`xpack.ml.max_open_jobs` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The maximum number of jobs that can run simultaneously on a node. Defaults to
			
 
				-`20`. In this context, jobs include both {anomaly-jobs} and {dfanalytics-jobs}.
			
 
				-The maximum number of jobs is also constrained by memory usage. Thus if the
			
 
				-estimated memory usage of the jobs would be higher than allowed, fewer jobs will
			
 
				-run on a node. Prior to version 7.1, this setting was a per-node non-dynamic
			
 
				-setting. It became a cluster-wide dynamic setting in version 7.1. As a result,
			
 
				-changes to its value after node startup are used only after every node in the
			
 
				-cluster is running version 7.1 or higher. The maximum permitted value is `512`.
			
 
				-
			
 
				-`xpack.ml.node_concurrent_job_allocations` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The maximum number of jobs that can concurrently be in the `opening` state on
			
 
				-each node. Typically, jobs spend a small amount of time in this state before
			
 
				-they move to `open` state. Jobs that must restore large models when they are
			
 
				-opening spend more time in the `opening` state. Defaults to `2`.
			
 
				+`xpack.ml.max_open_jobs`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The maximum number of jobs that can run 
			
 
				+simultaneously on a node. Defaults to `20`. In this context, jobs include both
			
 
				+{anomaly-jobs} and {dfanalytics-jobs}. The maximum number of jobs is also
			
 
				+constrained by memory usage. Thus if the estimated memory usage of the jobs
			
 
				+would be higher than allowed, fewer jobs will run on a node. Prior to version
			
 
				+7.1, this setting was a per-node non-dynamic setting. It became a cluster-wide
			
 
				+dynamic setting in version 7.1. As a result, changes to its value after node
			
 
				+startup are used only after every node in the cluster is running version 7.1 or
			
 
				+higher. The maximum permitted value is `512`.
			
 
				+
			
 
				+`xpack.ml.node_concurrent_job_allocations`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The maximum number of jobs that can
			
 
				+concurrently be in the `opening` state on each node. Typically, jobs spend a
			
 
				+small amount of time in this state before they move to `open` state. Jobs that
			
 
				+must restore large models when they are opening spend more time in the `opening`
			
 
				+state. Defaults to `2`.
			
 
				 
			
 
				 [discrete]
			
 
				 [[advanced-ml-settings]]
			
@@ -114,52 +114,55 @@ opening spend more time in the `opening` state. Defaults to `2`.
 
				 These settings are for advanced use cases; the default values are generally
			
 
				 sufficient:
			
 
				 
			
 
				-`xpack.ml.enable_config_migration` (<<cluster-update-settings,Dynamic>>)::
			
 
				-Reserved.
			
 
				+`xpack.ml.enable_config_migration`::
			
 
				+(<<cluster-update-settings,Dynamic>>) Reserved.
			
 
				 
			
 
				-`xpack.ml.max_anomaly_records` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The maximum number of records that are output per bucket. The default value is
			
 
				-`500`.
			
 
				+`xpack.ml.max_anomaly_records`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The maximum number of records that are
			
 
				+output per bucket. The default value is `500`.
			
 
				 
			
 
				-`xpack.ml.max_lazy_ml_nodes` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The number of lazily spun up Machine Learning nodes. Useful in situations
			
 
				-where ML nodes are not desired until the first Machine Learning Job
			
 
				-is opened. It defaults to `0` and has a maximum acceptable value of `3`.
			
 
				-If the current number of ML nodes is `>=` than this setting, then it is
			
 
				+`xpack.ml.max_lazy_ml_nodes`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The number of lazily spun up {ml} nodes.
			
 
				+Useful in situations where {ml} nodes are not desired until the first {ml} job
			
 
				+opens. It defaults to `0` and has a maximum acceptable value of `3`. If the
			
 
				+current number of {ml} nodes is greater than or equal to this setting, it is
			
 
				 assumed that there are no more lazy nodes available as the desired number
			
 
				-of nodes have already been provisioned. When a job is opened with this
			
 
				-setting set at `>0` and there are no nodes that can accept the job, then
			
 
				-the job will stay in the `OPENING` state until a new ML node is added to the
			
 
				-cluster and the job is assigned to run on that node.
			
 
				+of nodes have already been provisioned. If a job is opened and this setting has
			
 
				+a value greater than zero and there are no nodes that can accept the job, the
			
 
				+job stays in the `OPENING` state until a new {ml} node is added to the cluster
			
 
				+and the job is assigned to run on that node.
			
 
				 +
			
 
				-IMPORTANT: This setting assumes some external process is capable of adding ML nodes
			
 
				-to the cluster. This setting is only useful when used in conjunction with
			
 
				+IMPORTANT: This setting assumes some external process is capable of adding {ml}
			
 
				+nodes to the cluster. This setting is only useful when used in conjunction with
			
 
				 such an external process.
			
 
				 
			
 
				-`xpack.ml.process_connect_timeout` (<<cluster-update-settings,Dynamic>>)::
			
 
				-The connection timeout for {ml} processes that run separately from the {es} JVM.
			
 
				-Defaults to `10s`. Some {ml} processing is done by processes that run separately
			
 
				-to the {es} JVM. When such processes are started they must connect to the {es}
			
 
				-JVM. If such a process does not connect within the time period specified by this
			
 
				-setting then the process is assumed to have failed. Defaults to `10s`. The minimum
			
 
				-value for this setting is `5s`.
			
 
				+`xpack.ml.process_connect_timeout`::
			
 
				+(<<cluster-update-settings,Dynamic>>) The connection timeout for {ml} processes
			
 
				+that run separately from the {es} JVM. Defaults to `10s`. Some {ml} processing
			
 
				+is done by processes that run separately to the {es} JVM. When such processes
			
 
				+are started they must connect to the {es} JVM. If such a process does not
			
 
				+connect within the time period specified by this setting then the process is
			
 
				+assumed to have failed. Defaults to `10s`. The minimum value for this setting is
			
 
				+`5s`.
			
 
				 
			
 
				 [discrete]
			
 
				 [[model-inference-circuit-breaker]]
			
 
				 ==== {ml-cap} circuit breaker settings
			
 
				 
			
 
				-`breaker.model_inference.limit` (<<cluster-update-settings,Dynamic>>)::
			
 
				-Limit for the model inference breaker, which defaults to 50% of the JVM heap.
			
 
				-If the parent circuit breaker is less than 50% of the JVM heap, it is bound
			
 
				-to that limit instead. See <<circuit-breaker>>.
			
 
				+`breaker.model_inference.limit`::
			
 
				+(<<cluster-update-settings,Dynamic>>) Limit for the model inference breaker,
			
 
				+which defaults to 50% of the JVM heap. If the parent circuit breaker is less
			
 
				+than 50% of the JVM heap, it is bound to that limit instead. See
			
 
				+<<circuit-breaker>>.
			
 
				 
			
 
				-`breaker.model_inference.overhead` (<<cluster-update-settings,Dynamic>>)::
			
 
				-A constant that all accounting estimations are multiplied by to determine
			
 
				-a final estimation. Defaults to 1. See <<circuit-breaker>>.
			
 
				+`breaker.model_inference.overhead`::
			
 
				+(<<cluster-update-settings,Dynamic>>) A constant that all accounting estimations
			
 
				+are multiplied by to determine a final estimation. Defaults to 1. See
			
 
				+<<circuit-breaker>>.
			
 
				 
			
 
				 `breaker.model_inference.type`::
			
 
				-The underlying type of the circuit breaker. There are two valid options: `noop`
			
 
				-and `memory`. `noop` means the circuit breaker does nothing to prevent too much
			
 
				-memory usage. `memory` means the circuit breaker tracks the memory used by
			
 
				-inference models and can potentially break and prevent OutOfMemory errors. The
			
 
				-default is `memory`.
			
 
				+(<<static-cluster-setting,Static>>) The underlying type of the circuit breaker.
			
 
				+There are two valid options: `noop` and `memory`. `noop` means the circuit
			
 
				+breaker does nothing to prevent too much memory usage. `memory` means the
			
 
				+circuit breaker tracks the memory used by inference models and can potentially
			
 
				+break and prevent `OutOfMemory` errors. The default is `memory`.
			
--- a/docs/reference/settings/transform-settings.asciidoc
+++ b/docs/reference/settings/transform-settings.asciidoc
@@ -10,11 +10,6 @@
 
				 You do not need to configure any settings to use {transforms}. It is enabled by
			
 
				 default.
			
 
				 
			
 
				-All of these settings can be added to the `elasticsearch.yml` configuration file.
			
 
				-The dynamic settings can also be updated across a cluster with the
			
 
				-<<cluster-update-settings,cluster update settings API>>. Dynamic settings take
			
 
				-precedence over settings in the `elasticsearch.yml` file.
			
 
				-
			
 
				 [discrete]
			
 
				 [[general-transform-settings]]
			
 
				 ==== General {transforms} settings