|
@@ -108,10 +108,13 @@ other types will be added, for example `regression`.
|
|
|
|
|
|
An {oldetection} configuration object has the following properties:
|
|
|
|
|
|
-`n_neighbors`::
|
|
|
- (integer) Defines the value for how many nearest neighbors each method of
|
|
|
- {oldetection} will use to calculate its {olscore}. When the value is
|
|
|
- not set, the system will dynamically detect an appropriate value.
|
|
|
+`compute_feature_influence`::
|
|
|
+ (boolean) If `true`, the feature influence calculation is enabled. Defaults to
|
|
|
+ `true`.
|
|
|
+
|
|
|
+`feature_influence_threshold`::
|
|
|
+ (double) The minimum {olscore} that a document must have for its {fiscore}
|
|
|
+ to be calculated. Value range: 0-1 (`0.1` by default).
|
|
|
|
|
|
`method`::
|
|
|
(string) Sets the method that {oldetection} uses. If the method is not set
|
|
@@ -119,8 +122,21 @@ An {oldetection} configuration object has the following properties:
|
|
|
combines their individual {olscores} to obtain the overall {olscore}. We
|
|
|
recommend to use the ensemble method. Available methods are `lof`, `ldof`,
|
|
|
`distance_kth_nn`, `distance_knn`.
|
|
|
-
|
|
|
-`feature_influence_threshold`::
|
|
|
- (double) The minimum {olscore} that a document needs to have in order to
|
|
|
- calculate its {fiscore}.
|
|
|
- Value range: 0-1 (`0.1` by default).
|
|
|
+
|
|
|
+`n_neighbors`::
|
|
|
+ (integer) Defines the value for how many nearest neighbors each method of
|
|
|
+ {oldetection} will use to calculate its {olscore}. When the value is not set,
|
|
|
+ different values will be used for different ensemble members. This helps
|
|
|
+ improve diversity in the ensemble. Therefore, only override this if you are
|
|
|
+ confident that the value you choose is appropriate for the data set.
|
|
|
+
|
|
|
+`outlier_fraction`::
|
|
|
+ (double) Sets the proportion of the data set that is assumed to be outlying prior to
|
|
|
+ {oldetection}. For example, `0.05` means it is assumed that 5% of values are real outliers
|
|
|
+ and 95% are inliers.
|
|
|
+
|
|
|
+`standardize_columns`::
|
|
|
+ (boolean) If `true`, then the following operation is performed on the columns
|
|
|
+ before computing outlier scores: `(x_i - mean(x_i)) / sd(x_i)`. Defaults to
|
|
|
+ `true`. For more information, see
|
|
|
+ https://en.wikipedia.org/wiki/Feature_scaling#Standardization_(Z-score_Normalization)[this wiki page about standardization].
|
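For reviewers, the `standardize_columns` operation described in the hunk above, `(x_i - mean(x_i)) / sd(x_i)`, can be illustrated with a minimal Python sketch. This is not the analytics implementation: the function name `standardize_columns` and the row-list input are hypothetical, and the sketch assumes the population standard deviation and maps constant columns to zero, neither of which the text specifies.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Rescale each column to zero mean and unit standard deviation.

    Assumptions (not stated in the documentation): the population
    standard deviation is used, and a constant column maps to zeros.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    scaled_columns = []
    for column in columns:
        mu = mean(column)
        sigma = pstdev(column)
        if sigma == 0:
            # Avoid division by zero for a column with no variation.
            scaled_columns.append([0.0] * len(column))
        else:
            scaled_columns.append([(x - mu) / sigma for x in column])
    # Transpose back so the result has the same row layout as the input.
    return [list(row) for row in zip(*scaled_columns)]


if __name__ == "__main__":
    data = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
    for row in standardize_columns(data):
        print(row)
```

After this transformation both columns are on the same scale, so a distance-based method such as `distance_knn` is not dominated by the column with the larger raw values.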