- [role="xpack"]
- [testenv="platinum"]
- [[explain-dfanalytics]]
- = Explain {dfanalytics} API
- [subs="attributes"]
- ++++
- <titleabbrev>Explain {dfanalytics} API</titleabbrev>
- ++++
- Explains a {dataframe-analytics-config}.
- experimental[]
- [[ml-explain-dfanalytics-request]]
- == {api-request-title}
- `GET _ml/data_frame/analytics/_explain` +
- `POST _ml/data_frame/analytics/_explain` +
- `GET _ml/data_frame/analytics/<data_frame_analytics_id>/_explain` +
- `POST _ml/data_frame/analytics/<data_frame_analytics_id>/_explain`
- [[ml-explain-dfanalytics-prereq]]
- == {api-prereq-title}
- If the {es} {security-features} are enabled, you must have the following
- privileges:
- * cluster: `monitor_ml`
-
- For more information, see <<security-privileges>> and {ml-docs-setup-privileges}.
- [[ml-explain-dfanalytics-desc]]
- == {api-description-title}
- This API provides explanations for a {dataframe-analytics-config} that either
- already exists or has not been created yet.
- The following explanations are provided:
- * which fields are included in or excluded from the analysis, and why,
- * how much memory the analysis is estimated to require. You can use the estimate
- when choosing a value for the `model_memory_limit` setting later on.
- If you have object fields or fields that are excluded via source filtering,
- they are not included in the explanation.
- [[ml-explain-dfanalytics-path-params]]
- == {api-path-parms-title}
- `<data_frame_analytics_id>`::
- (Optional, string)
- include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-data-frame-analytics]
- [[ml-explain-dfanalytics-request-body]]
- == {api-request-body-title}
- A {dataframe-analytics-config} as described in <<put-dfanalytics>>.
- Note that `id` and `dest` don't need to be provided in the context of this API.
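As a rough illustration of the request forms listed on this page, the following Python sketch assembles the path and body for an explain call. The helper name `build_explain_request` is hypothetical, and nothing here contacts a cluster; it only assumes the endpoint paths shown under the request title.

```python
import json


def build_explain_request(config=None, analytics_id=None):
    """Return (path, body) for an explain call.

    Hypothetical helper: it only assembles the documented endpoint path
    and serializes the optional config. Sending it is left to the caller.
    """
    base = "_ml/data_frame/analytics"
    if analytics_id is not None:
        # Explain a configuration that already exists.
        path = f"{base}/{analytics_id}/_explain"
    else:
        # Explain a configuration supplied in the request body.
        path = f"{base}/_explain"
    body = json.dumps(config) if config is not None else None
    return path, body


# A configuration that has not been created yet: no `id` or `dest` is supplied.
path, body = build_explain_request(
    config={
        "source": {"index": "houses_sold_last_10_yrs"},
        "analysis": {"regression": {"dependent_variable": "price"}},
    }
)
```

The returned path can be used with either `GET` or `POST`, as both methods are listed for each endpoint.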
- [role="child_attributes"]
- [[ml-explain-dfanalytics-results]]
- == {api-response-body-title}
- The API returns a response that contains the following:
- `field_selection`::
- (array)
- An array of objects, one per field, that explain whether and why the field was
- selected for the analysis. The array is sorted by field name.
- +
- .Properties of `field_selection` objects
- [%collapsible%open]
- ====
- `is_included`:::
- (boolean) Whether the field is selected to be included in the analysis.
- `is_required`:::
- (boolean) Whether the field is required.
- `feature_type`:::
- (string) The feature type of this field for the analysis. May be `categorical`
- or `numerical`.
- `mapping_types`:::
- (array of strings) The mapping types of the field.
- `name`:::
- (string) The field name.
- `reason`:::
- (string) The reason a field is not selected to be included in the analysis.
- ====
- `memory_estimation`::
- (object)
- An object containing the memory estimates.
- +
- .Properties of `memory_estimation`
- [%collapsible%open]
- ====
- `expected_memory_with_disk`:::
- (string) Estimated memory usage under the assumption that overflowing to disk is
- allowed during {dfanalytics}. `expected_memory_with_disk` is usually smaller
- than `expected_memory_without_disk` because using disk reduces the main
- memory needed to perform {dfanalytics}.
- `expected_memory_without_disk`:::
- (string) Estimated memory usage under the assumption that the whole
- {dfanalytics} happens in memory (that is, without overflowing to disk).
- ====
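One way to relate the two estimates to a memory budget is sketched below in Python. This is an illustration only: `parse_size` and `pick_estimate` are hypothetical helpers, the fallback policy is an assumption rather than a documented recommendation, and the parser assumes 1024-based units written like `32MB`.

```python
import re

# Binary multipliers; assumes sizes use 1024-based units such as "32MB".
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text):
    """Convert a size string such as '128MB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    unit = unit.lower()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(float(value) * _UNITS[unit])


def pick_estimate(memory_estimation, budget):
    """Return the (key, value) of the first estimate that fits `budget`.

    Hypothetical policy: prefer the in-memory estimate, fall back to the
    disk-backed one, and return None when neither fits.
    """
    limit = parse_size(budget)
    for key in ("expected_memory_without_disk", "expected_memory_with_disk"):
        if parse_size(memory_estimation[key]) <= limit:
            return key, memory_estimation[key]
    return None


estimation = {
    "expected_memory_without_disk": "128MB",
    "expected_memory_with_disk": "32MB",
}
choice = pick_estimate(estimation, "64mb")
```

With a 64 MB budget, only the disk-backed estimate fits, so `choice` refers to `expected_memory_with_disk`.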
- [[ml-explain-dfanalytics-example]]
- == {api-examples-title}
- [source,console]
- --------------------------------------------------
- POST _ml/data_frame/analytics/_explain
- {
-   "source": {
-     "index": "houses_sold_last_10_yrs"
-   },
-   "analysis": {
-     "regression": {
-       "dependent_variable": "price"
-     }
-   }
- }
- --------------------------------------------------
- // TEST[skip:TBD]
- The API returns the following results:
- [source,console-result]
- ----
- {
-   "field_selection": [
-     {
-       "name": "number_of_bedrooms",
-       "mapping_types": ["integer"],
-       "is_included": true,
-       "is_required": false,
-       "feature_type": "numerical"
-     },
-     {
-       "name": "postcode",
-       "mapping_types": ["text"],
-       "is_included": false,
-       "is_required": false,
-       "reason": "[postcode.keyword] is preferred because it is aggregatable"
-     },
-     {
-       "name": "postcode.keyword",
-       "mapping_types": ["keyword"],
-       "is_included": true,
-       "is_required": false,
-       "feature_type": "categorical"
-     },
-     {
-       "name": "price",
-       "mapping_types": ["float"],
-       "is_included": true,
-       "is_required": true,
-       "feature_type": "numerical"
-     }
-   ],
-   "memory_estimation": {
-     "expected_memory_without_disk": "128MB",
-     "expected_memory_with_disk": "32MB"
-   }
- }
- ----
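A response like this can be summarized on the client side. The Python sketch below is a minimal illustration, assuming the response has already been parsed into a dictionary and uses the property names documented in the response body section (`name`, `is_included`, `reason`). The function name `summarize_field_selection` is hypothetical, and the sample data is abbreviated from the houses example.

```python
def summarize_field_selection(field_selection):
    """Split explained fields into included names and excluded name -> reason."""
    included = [f["name"] for f in field_selection if f["is_included"]]
    excluded = {
        f["name"]: f.get("reason", "")
        for f in field_selection
        if not f["is_included"]
    }
    return included, excluded


# Sample data shaped like the documented `field_selection` objects.
sample = [
    {"name": "number_of_bedrooms", "is_included": True, "is_required": False},
    {
        "name": "postcode",
        "is_included": False,
        "is_required": False,
        "reason": "[postcode.keyword] is preferred because it is aggregatable",
    },
    {"name": "postcode.keyword", "is_included": True, "is_required": False},
    {"name": "price", "is_included": True, "is_required": True},
]

included, excluded = summarize_field_selection(sample)
```

Here `included` keeps the order of the response, and `excluded` maps each skipped field to the reason the API reported for it.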