- [[search-suggesters-completion]]
- === Completion Suggester
- NOTE: In order to understand the format of suggestions, please
- read the <<search-suggesters>> page first.
- The `completion` suggester is a so-called prefix suggester. It does not
- do spell correction like the `term` or `phrase` suggesters but allows
- basic `auto-complete` functionality.
- ==== Why another suggester? Why not prefix queries?
- The first question that comes to mind when reading about a prefix
- suggester is why you should use it at all if you already have prefix
- queries. The answer is simple: prefix suggestions are fast.
- The data structures are internally backed by Lucene's
- `AnalyzingSuggester`, which uses FSTs (finite state transducers) to
- execute suggestions. Usually these data structures are costly to create,
- are stored in memory, and need to be rebuilt every now and then to
- reflect changes in your indexed documents. The `completion` suggester
- circumvents this by storing the FST as part of your index at index
- time. This allows for really fast loads and executions.
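To make the idea of prefix lookup concrete, here is a minimal Python sketch. It is not Lucene's FST and not how the `completion` suggester is implemented; it is a toy stand-in that keeps the lowercased inputs in a sorted list and uses binary search to find the first candidate. The class name `PrefixIndex` and its methods are hypothetical, introduced only for illustration.

```python
import bisect


class PrefixIndex:
    """Toy prefix lookup over a sorted list (NOT an FST, illustration only)."""

    def __init__(self, entries):
        # entries: iterable of (input_text, weight) pairs.
        # Keys are lowercased, loosely imitating what a simple analyzer does.
        self._entries = sorted((text.lower(), weight, text) for text, weight in entries)
        self._keys = [key for key, _, _ in self._entries]

    def suggest(self, prefix, size=5):
        prefix = prefix.lower()
        # Binary search for the first key that is >= the prefix.
        i = bisect.bisect_left(self._keys, prefix)
        matches = []
        # All keys sharing the prefix are adjacent in sorted order.
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            matches.append(self._entries[i])
            i += 1
        # Rank by weight, highest first (ties keep alphabetical order).
        matches.sort(key=lambda entry: -entry[1])
        return [text for _, _, text in matches[:size]]


index = PrefixIndex([
    ("Nirvana", 34),
    ("Nevermind", 34),
    ("Foo Fighters", 10),
    ("Nine Inch Nails", 50),
])
print(index.suggest("n"))
```

The point of the sketch is only that a prebuilt, ordered structure answers a prefix lookup without scanning every document, which is the property the FST provides in a far more compact form.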
- [[completion-suggester-mapping]]
- ==== Mapping
- To use this feature, you have to specify a special mapping for the
- field, which enables the FST-based storage described above.
- [source,js]
- --------------------------------------------------
- curl -X PUT localhost:9200/music
- curl -X PUT localhost:9200/music/song/_mapping -d '{
-   "song" : {
-     "properties" : {
-       "name" : { "type" : "string" },
-       "suggest" : {
-         "type" : "completion",
-         "index_analyzer" : "simple",
-         "search_analyzer" : "simple",
-         "payloads" : true
-       }
-     }
-   }
- }'
- --------------------------------------------------
- Mapping supports the following parameters:
- `index_analyzer`::
- The index analyzer to use, defaults to `simple`.
- `search_analyzer`::
- The search analyzer to use, defaults to `simple`.
- In case you are wondering why we did not opt for the `standard`
- analyzer: we want behaviour that is easy to understand. With the
- `standard` analyzer, if you index the field content `At the Drive-in`,
- you will not get any suggestions for `a`, nor for `d` (the first
- non-stopword).
- `payloads`::
- Enables the storing of payloads, defaults to `false`.
- `preserve_separators`::
- Preserves the separators, defaults to `true`.
- If disabled, you could find a field starting with `Foo Fighters`, if you
- suggest for `foof`.
- `preserve_position_increments`::
- Enables position increments, defaults
- to `true`. If disabled and you use an analyzer that removes stopwords,
- you could get a field starting with `The Beatles` if you suggest for
- `b`. *Note*: You could also achieve this by indexing two inputs,
- `Beatles` and `The Beatles`; there is no need to change a simple
- analyzer if you are able to enrich your data.
- `max_input_length`::
- Limits the length of a single input, defaults to `50` UTF-16 code points.
- This limit is only used at index time to reduce the total number of
- characters per input string in order to prevent massive inputs from
- bloating the underlying data structure. Most use cases won't be affected
- by the default value, since prefix completions rarely grow beyond
- a handful of characters. (The old name `max_input_len` is deprecated.)
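The effect of `preserve_separators` can be pictured with a small Python model. This is a simplified sketch, not the actual implementation: it assumes the analyzer lowercases the text and splits it on non-letters (roughly what the `simple` analyzer does), and it models separators as a single space between tokens. The helper names `analyzed_key` and `prefix_matches` are hypothetical.

```python
import re


def analyzed_key(text, preserve_separators=True):
    """Lowercase, split on non-letters, then join tokens with or without a separator."""
    tokens = [token.lower() for token in re.findall(r"[^\W\d_]+", text)]
    separator = " " if preserve_separators else ""
    return separator.join(tokens)


def prefix_matches(query, indexed_input, preserve_separators=True):
    """True if the analyzed query is a prefix of the analyzed input."""
    key = analyzed_key(indexed_input, preserve_separators)
    return key.startswith(analyzed_key(query, preserve_separators))


# With separators preserved, "foof" does not reach across the word boundary.
print(prefix_matches("foof", "Foo Fighters", preserve_separators=True))
# With separators dropped, the tokens run together and "foof" matches.
print(prefix_matches("foof", "Foo Fighters", preserve_separators=False))
```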
- [[indexing]]
- ==== Indexing
- [source,js]
- --------------------------------------------------
- curl -X PUT 'localhost:9200/music/song/1?refresh=true' -d '{
-   "name" : "Nevermind",
-   "suggest" : {
-     "input": [ "Nevermind", "Nirvana" ],
-     "output": "Nirvana - Nevermind",
-     "payload" : { "artistId" : 2321 },
-     "weight" : 34
-   }
- }'
- --------------------------------------------------
- The following parameters are supported:
- `input`::
- The input to store. This can be an array of strings or just
- a string. This field is mandatory.
- `output`::
- The string to return if a suggestion matches. This is very
- useful to normalize outputs (e.g. to always have them in the format
- `artist - songname`). The result is de-duplicated if several documents
- have the same output, i.e. only one is returned as part of the
- suggest result. This field is optional.
- `payload`::
- An arbitrary JSON object, which is simply returned in the
- suggest option. You could store data like the id of a document in order
- to load it from Elasticsearch without executing another search (which
- might not yield any results if `input` and `output` differ strongly).
- `weight`::
- A positive integer, which defines a weight and allows you to
- rank your suggestions. This field is optional.
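As an illustration of the parameters above, the following Python sketch assembles the `suggest` field for a document and checks the two constraints the list states: `input` is mandatory and `weight` is a positive integer. The helper `build_suggest_field` is hypothetical and not part of any client library; it only produces the JSON body that could then be sent with a request like the one shown earlier.

```python
import json


def build_suggest_field(inputs, output=None, payload=None, weight=None):
    """Build the value of a completion field from the documented parameters."""
    if isinstance(inputs, str):
        inputs = [inputs]  # a single string is allowed as well as an array
    inputs = list(inputs)
    if not inputs:
        raise ValueError("`input` is mandatory")
    field = {"input": inputs}
    if output is not None:
        field["output"] = output
    if payload is not None:
        field["payload"] = payload
    if weight is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, int) or weight <= 0:
            raise ValueError("`weight` must be a positive integer")
        field["weight"] = weight
    return field


document = {
    "name": "Nevermind",
    "suggest": build_suggest_field(
        ["Nevermind", "Nirvana"],
        output="Nirvana - Nevermind",
        payload={"artistId": 2321},
        weight=34,
    ),
}
print(json.dumps(document, indent=2))
```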
- NOTE: You can also opt for the shortest form shown below, which can
- even be used inside multi fields. Keep in mind, however, that you lose
- most of the features of the completion suggester this way: you will
- not be able to use several inputs, an output, payloads or weights.
- [source,js]
- --------------------------------------------------
- {
- "suggest" : "Nirvana"
- }
- --------------------------------------------------
- NOTE: The suggest data structure might not reflect deletes on
- documents immediately. You may need to run an <<indices-optimize>> for that.
- You can call optimize with `only_expunge_deletes=true` to target only deletes,
- or alternatively trigger an <<index-modules-merge>> operation.
- [[querying]]
- ==== Querying
- Suggesting works as usual, except that you have to specify the suggest
- type as `completion`.
- [source,js]
- --------------------------------------------------
- curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
-   "song-suggest" : {
-     "text" : "n",
-     "completion" : {
-       "field" : "suggest"
-     }
-   }
- }'
- {
-   "_shards" : {
-     "total" : 5,
-     "successful" : 5,
-     "failed" : 0
-   },
-   "song-suggest" : [ {
-     "text" : "n",
-     "offset" : 0,
-     "length" : 4,
-     "options" : [ {
-       "text" : "Nirvana - Nevermind",
-       "score" : 34.0, "payload" : {"artistId":2321}
-     } ]
-   } ]
- }
- --------------------------------------------------
- As you can see, the payload is included in the response if payloads
- are enabled in the mapping. If you configured a weight for a suggestion,
- this weight is used as `score`. The `text` field contains the `output`
- of your indexed suggestion if one is configured, otherwise the matched
- part of the `input` field.
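A client typically only needs the options out of such a response. The Python sketch below walks a response of the shape shown above and collects `text`, `score` and `payload` for each option. The function name `extract_options` is hypothetical, and the response literal is a trimmed copy of the example above rather than the output of a live cluster.

```python
import json


def extract_options(response, suggestion_name):
    """Collect (text, score, payload) tuples for one named suggestion."""
    results = []
    for entry in response.get(suggestion_name, []):
        for option in entry.get("options", []):
            # `payload` is only present if payloads are enabled in the mapping.
            results.append((option["text"], option.get("score"), option.get("payload")))
    return results


raw_response = """
{
  "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
  "song-suggest" : [ {
    "text" : "n",
    "offset" : 0,
    "options" : [ {
      "text" : "Nirvana - Nevermind",
      "score" : 34.0, "payload" : {"artistId":2321}
    } ]
  } ]
}
"""

for text, score, payload in extract_options(json.loads(raw_response), "song-suggest"):
    print(text, score, payload)
```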
- [[fuzzy]]
- ==== Fuzzy queries
- The completion suggester also supports fuzzy queries. This means that
- you can have a typo in your search and still get results back.
- [source,js]
- --------------------------------------------------
- curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
-   "song-suggest" : {
-     "text" : "n",
-     "completion" : {
-       "field" : "suggest",
-       "fuzzy" : {
-         "fuzziness" : 2
-       }
-     }
-   }
- }'
- --------------------------------------------------
- The fuzzy query can take specific fuzzy parameters.
- The following parameters are supported:
- [horizontal]
- `fuzziness`::
- The fuzziness factor, defaults to `AUTO`.
- See <<fuzziness>> for allowed settings.
- `transpositions`::
- Sets whether a transposition should be counted
- as one change (`true`) or two changes (`false`), defaults to `true`.
- `min_length`::
- Minimum length of the input before fuzzy
- suggestions are returned, defaults to `3`.
- `prefix_length`::
- Number of leading characters of the input that are not
- checked for fuzzy alternatives, defaults to `1`.
- `unicode_aware`::
- Takes all measurements (like edit distance,
- transpositions and lengths) in unicode code points
- (actual letters) instead of bytes.
- NOTE: If you want to stick with the default values, but
- still use fuzzy, you can either use `fuzzy: {}`
- or `fuzzy: true`.
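The interplay of these parameters can be sketched in Python. This is a simplified model under stated assumptions, not the suggester's actual algorithm: `edit_distance` is a plain optimal-string-alignment distance, and the hypothetical `fuzzy_prefix_match` interprets `min_length` and `prefix_length` as described in the list above. Python strings are sequences of code points, so the sketch measures in code points rather than bytes.

```python
def edit_distance(a, b, transpositions=True):
    """Edit distance; an adjacent swap costs 1 if transpositions is True, else 2."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # Swapping two adjacent characters counts as a single change.
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def fuzzy_prefix_match(query, candidate, fuzziness,
                       min_length=3, prefix_length=1, transpositions=True):
    """True if `query` matches some prefix of `candidate` within `fuzziness` edits."""
    if len(query) < min_length:
        # Input too short for fuzzy matching: exact prefix only.
        return candidate.startswith(query)
    if candidate[:prefix_length] != query[:prefix_length]:
        # The leading characters are never checked for fuzzy alternatives.
        return False
    return any(
        edit_distance(query, candidate[:k], transpositions) <= fuzziness
        for k in range(len(candidate) + 1)
    )


print(edit_distance("nirvnaa", "nirvana", transpositions=True))
print(edit_distance("nirvnaa", "nirvana", transpositions=False))
print(fuzzy_prefix_match("nrv", "nirvana", fuzziness=1))
```

In this model the swapped letters in `nirvnaa` cost one change with transpositions enabled and two without, which is the distinction the `transpositions` parameter controls.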