Merge pull request #15405 from alexg-dev/patch-1

More detailed explanation of some similarity types
Clinton Gormley 9 years ago
parent
commit
f20f41e02e
1 file changed, with 6 additions and 3 deletions:
  docs/reference/index-modules/similarity.asciidoc (+6 -3)


@@ -112,7 +112,10 @@ Type name: `DFR`
 ==== IB similarity.
 
 http://lucene.apache.org/core/5_2_1/core/org/apache/lucene/search/similarities/IBSimilarity.html[Information
-based model] . This similarity has the following options:
+based model] . The algorithm is based on the concept that the information content in any symbolic 'distribution'
+sequence is primarily determined by the repetitive usage of its basic elements.
+For written texts, this challenge would correspond to comparing the writing styles of different authors.
+This similarity has the following options:
 
 [horizontal]
 `distribution`::  Possible values: `ll` and `spl`.
@@ -138,11 +141,11 @@ Type name: `LMDirichlet`
 ==== LM Jelinek Mercer similarity.
 
 http://lucene.apache.org/core/5_2_1/core/org/apache/lucene/search/similarities/LMJelinekMercerSimilarity.html[LM
-Jelinek Mercer similarity] . This similarity has the following options:
+Jelinek Mercer similarity] . The algorithm attempts to capture important patterns in the text, while leaving out noise. This similarity has the following options:
 
 [horizontal]
 `lambda`::  The optimal value depends on both the collection and the query. The optimal value is around `0.1`
-for title queries and `0.7` for long queries. Default to `0.1`.
+for title queries and `0.7` for long queries. Default to `0.1`. When the value approaches `0`, documents that match more query terms will be ranked higher than those that match fewer terms.
 
 Type name: `LMJelinekMercer`
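
The effect of `lambda` described in the added line can be illustrated with a small sketch. It assumes the per-term score has the form `log(1 + ((1 - lambda) * tf / docLen) / (lambda * P(term | collection)))`, which is how Lucene's `LMJelinekMercerSimilarity` is commonly described. Boosts and norm encoding are ignored, and the function name `jm_score`, the two toy documents, and the collection probabilities are hypothetical values chosen only for illustration.

```python
import math


def jm_score(query_terms, doc_tokens, collection_prob, lam):
    """Sum Jelinek-Mercer smoothed scores over the query terms found in the document.

    Assumed per-term formula (simplified, no boosts or norm encoding):
        log(1 + ((1 - lam) * tf / doc_len) / (lam * P(term | collection)))
    """
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_terms:
        tf = doc_tokens.count(term)
        if tf == 0:
            continue  # terms absent from the document contribute nothing
        p_doc = tf / doc_len
        score += math.log(1.0 + ((1.0 - lam) * p_doc) / (lam * collection_prob[term]))
    return score


# Hypothetical toy data: doc_a matches both query terms once,
# doc_b matches only one query term, but many times.
query = ["quick", "fox"]
doc_a = ["quick", "fox"] + ["filler"] * 8
doc_b = ["quick"] * 8 + ["filler"] * 2
collection_prob = {"quick": 0.01, "fox": 0.01}

for lam in (0.1, 0.9):
    score_a = jm_score(query, doc_a, collection_prob, lam)
    score_b = jm_score(query, doc_b, collection_prob, lam)
    print(f"lambda={lam}: doc_a={score_a:.3f} doc_b={score_b:.3f}")
```

With these toy values, `doc_a` (two matching terms) scores above `doc_b` at `lambda = 0.1`, while at `lambda = 0.9` the order flips and the document that repeats a single term wins. This matches the added sentence in the diff: small values of `lambda` favor documents that match more of the query terms.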