|
@@ -18,10 +18,36 @@
|
|
|
(object) You can specify both `includes` and/or `excludes` patterns. If
|
|
|
`analyzed_fields` is not set, only the relevant fields will be included. For
|
|
|
example all the numeric fields for {oldetection}.
|
|
|
+
|
|
|
+[source,js]
|
|
|
+--------------------------------------------------
|
|
|
+PUT _ml/data_frame/analytics/loganalytics
|
|
|
+{
|
|
|
+ "source": {
|
|
|
+ "index": "logdata"
|
|
|
+ },
|
|
|
+ "dest": {
|
|
|
+ "index": "logdata_out"
|
|
|
+ },
|
|
|
+ "analysis": {
|
|
|
+ "outlier_detection": {
|
|
|
+ }
|
|
|
+ },
|
|
|
+ "analyzed_fields": {
|
|
|
+ "includes": [ "request.bytes", "response.counts.error" ],
|
|
|
+ "excludes": [ "source.geo" ]
|
|
|
+ }
|
|
|
+}
|
|
|
+--------------------------------------------------
|
|
|
+// CONSOLE
|
|
|
+// TEST[setup:setup_logdata]
|
|
|
|
|
|
`dest`::
|
|
|
- (object) The destination configuration of the analysis. For more information,
|
|
|
- see <<dfanalytics-dest-resources>>.
|
|
|
+ (object) The destination configuration of the analysis. The `index` property
|
|
|
+ (string) is the name of the index in which to store the results of the
|
|
|
+ {dfanalytics-job}. The `results_field` (string) property defines the name of
|
|
|
+ the field in which to store the results of the analysis. The default value is
|
|
|
+ `ml`.
|
|
|
|
|
|
`id`::
|
|
|
(string) The unique identifier for the {dfanalytics-job}. This identifier can
|
|
@@ -38,25 +64,29 @@
|
|
|
that setting. For more information, see <<ml-settings>>.
|
|
|
|
|
|
`source`::
|
|
|
- (object) The source configuration, consisting of `index` and optionally a
|
|
|
- `query`. For more information, see <<dfanalytics-source-resources>>.
|
|
|
+ (object) The source configuration, consisting of `index` (array), which is an
|
|
|
+ array of index names on which to perform the analysis. It can be a single
|
|
|
+ index or index pattern as well as an array of indices or patterns. Optionally,
|
|
|
+ `source` can have a `query` (object) property, which uses the {es} query
|
|
|
+ domain-specific language (DSL). This value corresponds to the query object in
|
|
|
+ an {es} search POST body. All the options that are supported by {es} can be
|
|
|
+ used, as this object is passed verbatim to {es}. By default, this property has
|
|
|
+ the following value: `{"match_all": {}}`.
|
|
|
|
|
|
[[dfanalytics-types]]
|
|
|
==== Analysis objects
|
|
|
|
|
|
{dfanalytics-cap} resources contain `analysis` objects. For example, when you
|
|
|
-create a {dfanalytics-job}, you must define the type of analysis it performs.
|
|
|
+create a {dfanalytics-job}, you must define the type of analysis it performs.
|
|
|
+Currently, `outlier_detection` is the only available type of analysis. Other
|
|
|
+types, for example `regression`, will be added.
|
|
|
|
|
|
[discrete]
|
|
|
[[oldetection-resources]]
|
|
|
-===== {oldetection-cap} configuration objects
|
|
|
+==== {oldetection-cap} configuration objects
|
|
|
|
|
|
An {oldetection} configuration object has the following properties:
|
|
|
|
|
|
-[discrete]
|
|
|
-[[oldetection-properties]]
|
|
|
-==== {api-definitions-title}
|
|
|
-
|
|
|
`n_neighbors`::
|
|
|
(integer) Defines the value for how many nearest neighbors each method of
|
|
|
{oldetection} will use to calculate its {olscore}. When the value is
|
|
@@ -65,44 +95,11 @@ An {oldetection} configuration object has the following properties:
|
|
|
`method`::
|
|
|
(string) Sets the method that {oldetection} uses. If the method is not set
|
|
|
{oldetection} uses an ensemble of different methods and normalises and
|
|
|
- combines their individual {olscores} to obtain the overall {olscore}.
|
|
|
- Available methods are `lof`, `ldof`, `distance_kth_nn`, `distance_knn`.
|
|
|
+ combines their individual {olscores} to obtain the overall {olscore}. We
|
|
|
+ recommend using the ensemble method. Available methods are `lof`, `ldof`,
|
|
|
+ `distance_kth_nn`, `distance_knn`.
|
|
|
|
|
|
`feature_influence_threshold`::
|
|
|
(double) The minimum {olscore} that a document needs to have in order to
|
|
|
calculate its {fiscore}.
|
|
|
- Value range: 0-1 (`0.1` by default).
|
|
|
-
|
|
|
-[[dfanalytics-dest-resources]]
|
|
|
-==== Dest configuration objects
|
|
|
-
|
|
|
-{dfanalytics-cap} resources contain `dest` objects. For example, when you
|
|
|
-create a {dfanalytics-job}, you must define its destination.
|
|
|
-
|
|
|
-[discrete]
|
|
|
-[[dfanalytics-dest-properties]]
|
|
|
-==== {api-definitions-title}
|
|
|
-
|
|
|
-`index`::
|
|
|
- (string) The name of the index in which to store the results of the
|
|
|
- {dfanalytics-job}.
|
|
|
-
|
|
|
-`results_field`::
|
|
|
- (string) The name of the field in which to store the results of the analysis.
|
|
|
- The default value is `ml`.
|
|
|
-
|
|
|
-[[dfanalytics-source-resources]]
|
|
|
-==== Source configuration objects
|
|
|
-
|
|
|
-The `source` configuration object has the following properties:
|
|
|
-
|
|
|
-`index`::
|
|
|
- (array) An array of index names on which to perform the analysis. It can be a
|
|
|
- single index or index pattern as well as an array of indices or patterns.
|
|
|
-
|
|
|
-`query`::
|
|
|
- (object) The {es} query domain-specific language (DSL). This value
|
|
|
- corresponds to the query object in an {es} search POST body. All the
|
|
|
- options that are supported by {es} can be used, as this object is
|
|
|
- passed verbatim to {es}. By default, this property has the following
|
|
|
- value: `{"match_all": {}}`.
|
|
|
+ Value range: 0-1 (`0.1` by default).
|