[[search-suggesters-completion]]
=== Completion Suggester

NOTE: In order to understand the format of suggestions, please
read the <<search-suggesters>> page first.

The `completion` suggester is a so-called prefix suggester. It does not
do spell correction like the `term` or `phrase` suggesters but allows
basic `auto-complete` functionality.

==== Why another suggester? Why not prefix queries?

The first question that comes to mind when reading about a prefix
suggester is why you should use it at all if you already have prefix
queries. The answer is simple: prefix suggestions are fast.

The data structures are internally backed by Lucene's
`AnalyzingSuggester`, which uses FSTs (finite state transducers) to
execute suggestions. Usually these data structures are costly to create,
are stored in memory, and need to be rebuilt every now and then to reflect
changes in your indexed documents. The `completion` suggester circumvents
this by storing the FST as part of your index at index time. This allows
for really fast loads and executions.

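To give a rough feel for why a precomputed lookup structure answers prefix requests quickly, here is a toy sketch in Python. It is not an FST and not how Lucene implements `AnalyzingSuggester`; it uses a sorted list with binary search as a stand-in, and the names `build_prefix_index` and `suggest` are hypothetical.

```python
import bisect


def build_prefix_index(entries):
    """Precompute a sorted list of (lowercased input, weight, output) tuples.

    This plays the role of the structure built once at index time.
    """
    return sorted((text.lower(), weight, output) for text, weight, output in entries)


def suggest(index, prefix, size=5):
    """Return up to `size` outputs whose input starts with `prefix`, highest weight first."""
    prefix = prefix.lower()
    # Binary search for the first entry that is >= the prefix.
    start = bisect.bisect_left(index, (prefix,))
    matches = []
    for text, weight, output in index[start:]:
        if not text.startswith(prefix):
            break  # sorted order: no later entry can match
        matches.append((weight, output))
    matches.sort(key=lambda match: -match[0])
    return [output for _, output in matches[:size]]


index = build_prefix_index([
    ("Nevermind", 34, "Nirvana - Nevermind"),
    ("Nirvana", 20, "Nirvana"),
    ("Foo Fighters", 10, "Foo Fighters"),
])
print(suggest(index, "n"))   # ['Nirvana - Nevermind', 'Nirvana']
print(suggest(index, "ni"))  # ['Nirvana']
```

The point of the sketch is only that the expensive work (sorting here, building the FST in the real suggester) happens once, ahead of the request.
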
[[completion-suggester-mapping]]
==== Mapping

To use this feature, you have to specify a dedicated mapping for the
field, which enables the special storage format described above.

[source,js]
--------------------------------------------------
curl -X PUT localhost:9200/music
curl -X PUT localhost:9200/music/song/_mapping -d '{
  "song" : {
    "properties" : {
      "name" : { "type" : "string" },
      "suggest" : {
        "type" : "completion",
        "index_analyzer" : "simple",
        "search_analyzer" : "simple",
        "payloads" : true
      }
    }
  }
}'
--------------------------------------------------

Mapping supports the following parameters:

`index_analyzer`::
    The index analyzer to use, defaults to `simple`.

`search_analyzer`::
    The search analyzer to use, defaults to `simple`.
    In case you are wondering why we did not opt for the `standard`
    analyzer: we try to have easy-to-understand behaviour here. With the
    `standard` analyzer, if you index the field content `At the Drive-in`,
    you will not get any suggestions for `a`, nor for `d` (the first
    non-stopword).

`payloads`::
    Enables the storing of payloads, defaults to `false`.

`preserve_separators`::
    Preserves the separators, defaults to `true`.
    If disabled, you could find a field starting with `Foo Fighters` if you
    suggest for `foof`.

`preserve_position_increments`::
    Enables position increments, defaults to `true`. If disabled and using
    an analyzer that removes stopwords, you could get a field starting with
    `The Beatles` if you suggest for `b`. *Note*: You could also achieve
    this by indexing two inputs, `Beatles` and `The Beatles`. There is no
    need to change a simple analyzer if you are able to enrich your data.

`max_input_length`::
    Limits the length of a single input, defaults to `50` UTF-16 code points.
    This limit is only used at index time to reduce the total number of
    characters per input string, in order to prevent massive inputs from
    bloating the underlying data structure. Most use cases won't be
    influenced by the default value, since prefix completions seldom grow
    beyond a handful of characters. The old name `max_input_len` is
    deprecated.

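The parameters above can be collected into a mapping body programmatically. The following is a minimal sketch in Python that only builds and prints the JSON; it does not contact a cluster. The helper name `completion_mapping` is hypothetical, and the defaults are the ones listed above.

```python
import json

# Defaults as documented in the parameter list above.
COMPLETION_DEFAULTS = {
    "index_analyzer": "simple",
    "search_analyzer": "simple",
    "payloads": False,
    "preserve_separators": True,
    "preserve_position_increments": True,
    "max_input_length": 50,
}


def completion_mapping(field="suggest", **overrides):
    """Build the `song` mapping body with a completion field (hypothetical helper)."""
    unknown = set(overrides) - set(COMPLETION_DEFAULTS)
    if unknown:
        raise ValueError("unknown completion parameters: %s" % sorted(unknown))
    settings = dict(COMPLETION_DEFAULTS, **overrides)
    completion_field = dict({"type": "completion"}, **settings)
    return {
        "song": {
            "properties": {
                "name": {"type": "string"},
                field: completion_field,
            }
        }
    }


# Body for: curl -X PUT localhost:9200/music/song/_mapping -d '...'
print(json.dumps(completion_mapping(payloads=True), indent=2))
```

Spelling out every default is redundant for the server but makes the effective configuration visible when reviewing a mapping.
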
[[indexing]]
==== Indexing

[source,js]
--------------------------------------------------
curl -X PUT 'localhost:9200/music/song/1?refresh=true' -d '{
  "name" : "Nevermind",
  "suggest" : {
    "input" : [ "Nevermind", "Nirvana" ],
    "output" : "Nirvana - Nevermind",
    "payload" : { "artistId" : 2321 },
    "weight" : 34
  }
}'
--------------------------------------------------

The following parameters are supported:

`input`::
    The input to store. This can be an array of strings or just
    a string. This field is mandatory.

`output`::
    The string to return if a suggestion matches. This is very
    useful to normalize outputs (e.g. to always have them in the format
    `artist - songname`). The result is de-duplicated if several documents
    have the same output, i.e. only one is returned as part of the
    suggest result. This field is optional.

`payload`::
    An arbitrary JSON object, which is simply returned in the
    suggest option. You could store data like the id of a document, in order
    to load it from Elasticsearch without executing another search (which
    might not yield any results if `input` and `output` differ strongly).

`weight`::
    A positive integer, which defines a weight and allows you to
    rank your suggestions. This field is optional.

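As an illustration of the constraints listed above, the following Python sketch assembles the `suggest` object for a document and rejects values that contradict the descriptions (missing `input`, non-object `payload`, non-positive `weight`). The helper `suggest_entry` is hypothetical and performs only client-side checks. It does not reproduce Elasticsearch's own validation.

```python
import json


def suggest_entry(inputs, output=None, payload=None, weight=None):
    """Build the value of the `suggest` field for one document (hypothetical helper)."""
    if isinstance(inputs, str):
        inputs = [inputs]  # a single string is allowed as well as an array
    if not inputs or not all(isinstance(i, str) and i for i in inputs):
        raise ValueError("`input` is mandatory and must contain non-empty strings")

    entry = {"input": list(inputs)}
    if output is not None:
        entry["output"] = output
    if payload is not None:
        if not isinstance(payload, dict):
            raise TypeError("`payload` must be a JSON object")
        entry["payload"] = payload
    if weight is not None:
        if isinstance(weight, bool) or not isinstance(weight, int) or weight <= 0:
            raise ValueError("`weight` must be a positive integer")
        entry["weight"] = weight
    return entry


document = {
    "name": "Nevermind",
    "suggest": suggest_entry(
        ["Nevermind", "Nirvana"],
        output="Nirvana - Nevermind",
        payload={"artistId": 2321},
        weight=34,
    ),
}
print(json.dumps(document, indent=2))
```

The printed JSON corresponds to the body of the indexing request shown earlier in this section.
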
NOTE: Even though you lose most of the features of the
completion suggester, you can opt for the shortest form, which even
allows you to use it inside multi fields. Keep in mind that you will
not be able to use several inputs, an output, payloads or weights.

[source,js]
--------------------------------------------------
{
  "suggest" : "Nirvana"
}
--------------------------------------------------

NOTE: The suggest data structure might not reflect deletes on
documents immediately. You may need to do an <<indices-optimize>> for that.
You can call optimize with `only_expunge_deletes=true` to only handle
deletes, or alternatively trigger a <<index-modules-merge>> operation.

[[querying]]
==== Querying

Suggesting works as usual, except that you have to specify the suggest
type as `completion`.

[source,js]
--------------------------------------------------
curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
  "song-suggest" : {
    "text" : "n",
    "completion" : {
      "field" : "suggest"
    }
  }
}'

{
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "song-suggest" : [ {
    "text" : "n",
    "offset" : 0,
    "length" : 1,
    "options" : [ {
      "text" : "Nirvana - Nevermind",
      "score" : 34.0,
      "payload" : { "artistId" : 2321 }
    } ]
  } ]
}
--------------------------------------------------

As you can see, the payload is included in the response if `payloads`
is enabled in the mapping and a payload was indexed. If you configured a
weight for a suggestion, this weight is used as the `score`. The `text`
field of each option contains the `output` of your indexed suggestion if
one was configured, otherwise the matched part of the `input` field.

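A client typically only needs the `options` of each suggestion entry. The following Python sketch pulls `text`, `score` and `payload` out of a response shaped like the one above. The function name `extract_options` is hypothetical, and the sample response is trimmed to the keys the function reads.

```python
def extract_options(response, suggestion_name):
    """Flatten the options of one named suggestion into a list of dicts."""
    results = []
    for entry in response.get(suggestion_name, []):
        for option in entry.get("options", []):
            results.append({
                "text": option["text"],
                "score": option.get("score"),      # the indexed weight, if any
                "payload": option.get("payload"),  # None when no payload is returned
            })
    return results


# Trimmed copy of the response shown above.
sample_response = {
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "song-suggest": [{
        "text": "n",
        "options": [{
            "text": "Nirvana - Nevermind",
            "score": 34.0,
            "payload": {"artistId": 2321},
        }],
    }],
}

for option in extract_options(sample_response, "song-suggest"):
    print(option["text"], option["score"], option["payload"])
```

Using `.get` for `score` and `payload` keeps the sketch tolerant of suggestions indexed without a weight or payload.
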
[[fuzzy]]
==== Fuzzy queries

The completion suggester also supports fuzzy queries. This means
you can have a typo in your search and still get results back.

[source,js]
--------------------------------------------------
curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
  "song-suggest" : {
    "text" : "n",
    "completion" : {
      "field" : "suggest",
      "fuzzy" : {
        "fuzziness" : 2
      }
    }
  }
}'
--------------------------------------------------

The fuzzy query can take specific fuzzy parameters.
The following parameters are supported:

[horizontal]
`fuzziness`::
    The fuzziness factor, defaults to `AUTO`.
    See <<fuzziness>> for allowed settings.

`transpositions`::
    If `true`, transpositions are counted as one change
    instead of two, defaults to `true`.

`min_length`::
    Minimum length of the input before fuzzy
    suggestions are returned, defaults to `3`.

`prefix_length`::
    Minimum length of the input which is not
    checked for fuzzy alternatives, defaults to `1`.

`unicode_aware`::
    If `true`, all measurements (like edit distance,
    transpositions and lengths) are in Unicode code points
    (actual letters) instead of bytes.

NOTE: If you want to stick with the default values but
still use fuzzy matching, you can use either `fuzzy: {}`
or `fuzzy: true`.
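
The request bodies in this section can be generated with a small helper. This Python sketch builds the body for the `_suggest` endpoint and accepts only the fuzzy parameters listed above. The function name `completion_request` is hypothetical, and nothing is sent to a cluster.

```python
import json

# Fuzzy parameters documented in the list above.
FUZZY_PARAMS = {
    "fuzziness", "transpositions", "min_length", "prefix_length", "unicode_aware",
}


def completion_request(name, text, field, fuzzy=None):
    """Build a completion suggest body; pass fuzzy={} to use the fuzzy defaults."""
    completion = {"field": field}
    if fuzzy is not None:
        unknown = set(fuzzy) - FUZZY_PARAMS
        if unknown:
            raise ValueError("unknown fuzzy parameters: %s" % sorted(unknown))
        completion["fuzzy"] = dict(fuzzy)
    return {name: {"text": text, "completion": completion}}


# Body for: curl -X POST 'localhost:9200/music/_suggest?pretty' -d '...'
body = completion_request("song-suggest", "n", "suggest", fuzzy={"fuzziness": 2})
print(json.dumps(body, indent=2))
```

Passing `fuzzy={}` produces the `fuzzy: {}` form mentioned in the note above, which enables fuzzy matching with the default values.
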