12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758 |
- [[similarity]]
- === `similarity`
- Elasticsearch allows you to configure a scoring algorithm or _similarity_ per
- field. The `similarity` setting provides a simple way of choosing a similarity
- algorithm other than the default `BM25`, such as `TF/IDF`.
- Similarities are mostly useful for <<text,`text`>> fields, but can also apply
- to other field types.
- Custom similarities can be configured by tuning the parameters of the built-in
- similarities. For more details about this expert options, see the
- <<index-modules-similarity,similarity module>>.
- The only similarities which can be used out of the box, without any further
- configuration are:
- `BM25`::
- The Okapi BM25 algorithm. The algorithm used by default in Elasticsearch and Lucene.
- See {defguide}/pluggable-similarites.html[Pluggable Similarity Algorithms]
- for more information.
- `classic`::
- The TF/IDF algorithm which used to be the default in Elasticsearch and
- Lucene. See {defguide}/practical-scoring-function.html[Lucene’s Practical Scoring Function]
- for more information.
- `boolean`::
- A simple boolean similarity, which is used when full-text ranking is not needed
- and the score should only be based on whether the query terms match or not.
- Boolean similarity gives terms a score equal to their query boost.
- The `similarity` can be set on the field level when a field is first created,
- as follows:
- [source,js]
- --------------------------------------------------
- PUT my_index
- {
- "mappings": {
- "_doc": {
- "properties": {
- "default_field": { <1>
- "type": "text"
- },
- "boolean_sim_field": {
- "type": "text",
- "similarity": "boolean" <2>
- }
- }
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- <1> The `default_field` uses the `BM25` similarity.
- <2> The `boolean_sim_field` uses the `boolean` similarity.
|