[[similarity]]
=== `similarity`

Elasticsearch allows you to configure a scoring algorithm or _similarity_ per
field. The `similarity` setting provides a simple way of choosing a similarity
algorithm other than the default TF/IDF, such as `BM25`.

Similarities are mostly useful for <<text,`text`>> fields, but can also apply
to other field types.

Custom similarities can be configured by tuning the parameters of the built-in
similarities. For more details about these expert options, see the
<<index-modules-similarity,similarity module>>.
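
As a minimal sketch of what such tuning might look like, the request below
defines a custom similarity in the index settings and then refers to it by
name from a field mapping. The index name `my_custom_index`, the similarity
name `my_bm25`, the field name `title`, and the `k1` and `b` values are all
hypothetical and chosen only for illustration; the similarity module remains
the authoritative reference for the available parameters.

[source,js]
--------------------------------------------------
PUT my_custom_index
{
  "settings": {
    "index": {
      "similarity": {
        "my_bm25": { <1>
          "type": "BM25",
          "k1": 1.5, <2>
          "b": 0.6
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "text",
          "similarity": "my_bm25" <3>
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
<1> `my_bm25` is an assumed name for a custom similarity based on the built-in `BM25` type.
<2> The `k1` and `b` values are illustrative assumptions, not recommended settings.
<3> The hypothetical `title` field refers to the custom similarity by name.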

The only similarities which can be used out of the box, without any further
configuration, are:

`classic`::
    The default TF/IDF algorithm used by Elasticsearch and
    Lucene. See {defguide}/practical-scoring-function.html[Lucene’s Practical Scoring Function]
    for more information.

`BM25`::
    The Okapi BM25 algorithm.
    See {defguide}/pluggable-similarites.html[Pluggable Similarity Algorithms]
    for more information.

The `similarity` can be set on the field level when a field is first created,
as follows:

[source,js]
--------------------------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "default_field": { <1>
          "type": "text"
        },
        "bm25_field": {
          "type": "text",
          "similarity": "BM25" <2>
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
<1> The `default_field` uses the `classic` similarity (i.e. TF/IDF).
<2> The `bm25_field` uses the `BM25` similarity.
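
Because the setting applies when a field is first created, a field added
later to an existing index should be able to declare its own similarity in
the same way. The following sketch assumes that `my_index` and `my_type`
already exist as created above; the field name `new_bm25_field` is
hypothetical.

[source,js]
--------------------------------------------------
PUT my_index/_mapping/my_type
{
  "properties": {
    "new_bm25_field": { <1>
      "type": "text",
      "similarity": "BM25"
    }
  }
}
--------------------------------------------------
// CONSOLE
<1> `new_bm25_field` is an assumed name for a field that does not yet exist in the mapping.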