[[similarity]]
=== `similarity`

Elasticsearch allows you to configure a scoring algorithm or _similarity_ per
field. The `similarity` setting provides a simple way of choosing a similarity
algorithm other than the default `BM25`, such as `TF/IDF`.

Similarities are mostly useful for <<text,`text`>> fields, but can also apply
to other field types.

Custom similarities can be configured by tuning the parameters of the built-in
similarities. For more details about these expert options, see the
<<index-modules-similarity,similarity module>>.
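
As a minimal sketch of that approach, the request below defines a tuned `BM25`
similarity in the index settings and then references it by name from a field
mapping. The index name `my_custom_index`, the similarity name `my_tuned_bm25`,
the field name `title`, and the `k1` and `b` values are illustrative
assumptions, not recommendations:

[source,js]
--------------------------------------------------
PUT my_custom_index
{
  "settings": {
    "index": {
      "similarity": {
        "my_tuned_bm25": { <1>
          "type": "BM25",
          "k1": 1.3,
          "b": 0.6
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text",
          "similarity": "my_tuned_bm25" <2>
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE

<1> A hypothetical custom similarity based on the built-in `BM25` type.
<2> The `title` field refers to the custom similarity by its name.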

The only similarities which can be used out of the box, without any further
configuration, are:

`BM25`::
        The Okapi BM25 algorithm, which is used by default in Elasticsearch and Lucene.
        See {defguide}/pluggable-similarites.html[Pluggable Similarity Algorithms]
        for more information.

`classic`::
        The TF/IDF algorithm, which used to be the default in Elasticsearch and
        Lucene. See {defguide}/practical-scoring-function.html[Lucene’s Practical Scoring Function]
        for more information.

`boolean`::
        A simple boolean similarity, which is used when full-text ranking is not needed
        and the score should only be based on whether the query terms match or not.
        Boolean similarity gives terms a score equal to their query boost.

The `similarity` can be set on the field level when a field is first created,
as follows:

[source,js]
--------------------------------------------------
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "default_field": { <1>
          "type": "text"
        },
        "boolean_sim_field": {
          "type": "text",
          "similarity": "boolean" <2>
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE

<1> The `default_field` uses the `BM25` similarity.
<2> The `boolean_sim_field` uses the `boolean` similarity.
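
To illustrate how the `boolean` similarity relates scores to query boosts, the
following sketch searches the `boolean_sim_field` from the mapping above with an
explicit `boost`. The query text and the boost value are illustrative
assumptions; with this similarity, each matching term is expected to contribute
its boost to the score rather than a value derived from term or document
statistics:

[source,js]
--------------------------------------------------
GET my_index/_search
{
  "query": {
    "match": {
      "boolean_sim_field": {
        "query": "quick fox", <1>
        "boost": 2.0 <2>
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

<1> Hypothetical query text, used only for illustration.
<2> The boost applied to the query terms.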