
--
:api: put-data-frame-analytics
:request: PutDataFrameAnalyticsRequest
:response: PutDataFrameAnalyticsResponse
--
[role="xpack"]
[id="{upid}-{api}"]
=== Put {dfanalytics-jobs} API

Creates a new {dfanalytics-job}.
The API accepts a +{request}+ object as a request and returns a +{response}+.

[id="{upid}-{api}-request"]
==== Put {dfanalytics-jobs} request

A +{request}+ requires the following argument:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------
<1> The configuration of the {dfanalytics-job} to create

[id="{upid}-{api}-config"]
==== {dfanalytics-cap} configuration

The `DataFrameAnalyticsConfig` object holds the complete configuration of the
{dfanalytics-job} and accepts the following arguments:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-config]
--------------------------------------------------
<1> The {dfanalytics-job} ID
<2> The source index and query from which to gather data
<3> The destination index
<4> The analysis to be performed
<5> The fields to be included in or excluded from the analysis
<6> The memory limit for the model created as part of the analysis process
<7> Optionally, a human-readable description
<8> The maximum number of threads to be used by the analysis. Defaults to 1.
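
As a rough illustration of how the eight settings above fit together, the following sketch assembles them into a nested `Map` shaped like the JSON configuration of a {dfanalytics-job}. It deliberately avoids the client classes so that it runs on the JDK alone. The class name `AnalyticsConfigSketch`, the index names, and all values are hypothetical, and the key names are an assumption based on the field names of the REST API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not part of the Elasticsearch client. */
final class AnalyticsConfigSketch {

    private AnalyticsConfigSketch() {}

    /** Builds a map with one entry per setting described in the callouts above. */
    static Map<String, Object> buildConfig() {
        // Source: the index to read from (no query is set in this sketch).
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("index", "example-source-index");

        // Destination: the index that receives the results.
        Map<String, Object> dest = new LinkedHashMap<>();
        dest.put("index", "example-dest-index");

        // Analysis: outlier detection with no explicit parameters.
        Map<String, Object> analysis = new LinkedHashMap<>();
        analysis.put("outlier_detection", new LinkedHashMap<String, Object>());

        // Analyzed fields: explicit include and exclude lists.
        Map<String, Object> analyzedFields = new LinkedHashMap<>();
        analyzedFields.put("includes", Arrays.asList("field_a", "field_b"));
        analyzedFields.put("excludes", Arrays.asList("field_c"));

        Map<String, Object> config = new LinkedHashMap<>();
        config.put("id", "example-job");
        config.put("source", source);
        config.put("dest", dest);
        config.put("analysis", analysis);
        config.put("analyzed_fields", analyzedFields);
        config.put("model_memory_limit", "1gb");
        config.put("description", "An example description");
        config.put("max_num_threads", 1);
        return config;
    }

    public static void main(String[] args) {
        System.out.println(buildConfig());
    }
}
```

In real client code the same information is passed through the `DataFrameAnalyticsConfig` builder shown in the tagged example, not through a raw map.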

[id="{upid}-{api}-query-config"]
==== SourceConfig

The index and the query from which to collect data.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-source-config]
--------------------------------------------------
<1> Constructing a new DataFrameAnalyticsSource
<2> The source index
<3> The query from which to gather the data. If no query is set, a `match_all` query is used by default.
<4> Source filtering to select which fields will exist in the destination index.

===== QueryConfig

The query with which to select data from the source.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-query-config]
--------------------------------------------------

==== DestinationConfig

The index to which data should be written by the {dfanalytics-job}.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-dest-config]
--------------------------------------------------
<1> Constructing a new DataFrameAnalyticsDest
<2> The destination index

==== Analysis

The analysis to be performed.
Currently, the supported analyses are +OutlierDetection+, +Classification+, and +Regression+.

===== Outlier detection

+OutlierDetection+ analysis can be created in one of two ways:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-outlier-detection-default]
--------------------------------------------------
<1> Constructing a new OutlierDetection object with the default strategy to determine outliers

or

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-outlier-detection-customized]
--------------------------------------------------
<1> Constructing a new OutlierDetection object
<2> The method used to perform the analysis
<3> The number of neighbors taken into account during analysis
<4> The minimum `outlier_score` required to compute feature influence
<5> Whether to compute feature influence
<6> The proportion of the data set that is assumed to be outlying prior to outlier detection
<7> Whether to apply standardization to feature values

===== Classification

+Classification+ analysis requires the +dependent_variable+ to be set and
accepts a number of optional parameters:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-classification]
--------------------------------------------------
<1> Constructing a new Classification builder object with the required dependent variable
<2> The lambda regularization parameter. A non-negative double.
<3> The gamma regularization parameter. A non-negative double.
<4> The applied shrinkage. A double in [0.001, 1].
<5> The maximum number of trees the forest is allowed to contain. An integer in [1, 2000].
<6> The fraction of features which will be used when selecting a random bag for each candidate split. A double in (0, 1].
<7> If set, feature importance is computed for the most important features.
<8> The name of the prediction field in the results object.
<9> The percentage of training-eligible rows to be used in training. Defaults to 100%.
<10> The seed to be used by the random generator that picks which rows are used in training.
<11> The optimization objective to target when assigning class labels. Defaults to `maximize_minimum_recall`.
<12> The number of top classes to be reported in the results. Defaults to 2.
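
The numeric ranges listed above can be checked before a request is built. The following is a minimal sketch of such a check, assuming nothing beyond the ranges stated in the callouts. The class `BoostedTreeParamsCheck` and its method are hypothetical and are not part of the client; the server performs its own validation regardless.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the Elasticsearch client. */
final class BoostedTreeParamsCheck {

    private BoostedTreeParamsCheck() {}

    /**
     * Returns one message per value that falls outside its documented range.
     * An empty list means every value is in range.
     */
    static List<String> validate(double lambda, double gamma, double shrinkage,
                                 int maxTrees, double featureBagFraction) {
        List<String> errors = new ArrayList<>();
        // The negated comparisons also reject NaN.
        if (!(lambda >= 0.0)) {
            errors.add("lambda must be non-negative");
        }
        if (!(gamma >= 0.0)) {
            errors.add("gamma must be non-negative");
        }
        if (!(shrinkage >= 0.001 && shrinkage <= 1.0)) {
            errors.add("shrinkage must be in [0.001, 1]");
        }
        if (maxTrees < 1 || maxTrees > 2000) {
            errors.add("maximum number of trees must be in [1, 2000]");
        }
        if (!(featureBagFraction > 0.0 && featureBagFraction <= 1.0)) {
            errors.add("feature bag fraction must be in (0, 1]");
        }
        return errors;
    }

    public static void main(String[] args) {
        // Example values chosen for illustration only.
        System.out.println(validate(1.0, 5.5, 0.1, 50, 0.4));
        System.out.println(validate(-1.0, 5.5, 0.0005, 2001, 0.0));
    }
}
```

Collecting all violations in a list, instead of throwing on the first one, lets a caller report every out-of-range value at once.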

===== Regression

+Regression+ analysis requires the +dependent_variable+ to be set and
accepts a number of optional parameters:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-regression]
--------------------------------------------------
<1> Constructing a new Regression builder object with the required dependent variable
<2> The lambda regularization parameter. A non-negative double.
<3> The gamma regularization parameter. A non-negative double.
<4> The applied shrinkage. A double in [0.001, 1].
<5> The maximum number of trees the forest is allowed to contain. An integer in [1, 2000].
<6> The fraction of features which will be used when selecting a random bag for each candidate split. A double in (0, 1].
<7> If set, feature importance is computed for the most important features.
<8> The name of the prediction field in the results object.
<9> The percentage of training-eligible rows to be used in training. Defaults to 100%.
<10> The seed to be used by the random generator that picks which rows are used in training.
<11> The loss function used for regression. Defaults to `mse`.
<12> An optional parameter to the loss function.
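
The training percentage and the seed work together: the percentage controls how many rows are used, and a fixed seed makes the choice repeatable. The sketch below illustrates only that idea with `java.util.Random`. It is not the sampling procedure Elasticsearch uses, and `TrainingRowSampler` and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of seeded row selection; not Elasticsearch's algorithm. */
final class TrainingRowSampler {

    private TrainingRowSampler() {}

    /**
     * Picks each row index independently with probability trainingPercent / 100,
     * using a generator initialized with the given seed.
     */
    static List<Integer> pickTrainingRows(int rowCount, double trainingPercent, long seed) {
        Random random = new Random(seed);
        List<Integer> picked = new ArrayList<>();
        for (int row = 0; row < rowCount; row++) {
            // nextDouble() is in [0, 1), so 100 percent always selects the row.
            if (random.nextDouble() * 100.0 < trainingPercent) {
                picked.add(row);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Integer> first = pickTrainingRows(20, 50.0, 123L);
        List<Integer> second = pickTrainingRows(20, 50.0, 123L);
        // The same seed yields the same selection on every run.
        System.out.println(first.equals(second));
    }
}
```

Reusing the same seed across runs is what makes it possible to compare two configurations on an identical training subset.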

==== Analyzed fields

A FetchContext object that specifies the fields to be included in or excluded from the analysis.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-analyzed-fields]
--------------------------------------------------
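
To make the include and exclude semantics concrete, here is a small sketch that applies two lists to a set of available field names. It rests on several assumptions that the text above does not confirm: an empty include list selects every field, an exclusion takes precedence over an inclusion, and names are matched exactly with no wildcard support. `AnalyzedFieldsSketch` is a hypothetical class, not part of the client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of include/exclude filtering by exact field name. */
final class AnalyzedFieldsSketch {

    private AnalyzedFieldsSketch() {}

    /** Returns the available fields that pass the include list and are not excluded. */
    static List<String> select(List<String> available, List<String> includes, List<String> excludes) {
        List<String> selected = new ArrayList<>();
        for (String field : available) {
            // Assumption: an empty include list means "include everything".
            boolean included = includes.isEmpty() || includes.contains(field);
            // Assumption: an exclusion wins over an inclusion.
            if (included && !excludes.contains(field)) {
                selected.add(field);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList("price", "quantity", "comment");
        System.out.println(select(available, Collections.<String>emptyList(), Arrays.asList("comment")));
    }
}
```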

include::../execution.asciidoc[]

[id="{upid}-{api}-response"]
==== Response

The returned +{response}+ contains the newly created {dfanalytics-job}.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------