[[query-dsl-match-query]]
=== Match Query

A family of `match` queries that accept text, numerics, or dates, analyze
the input, and construct a query from it. For example:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : "this is a test"
    }
}
--------------------------------------------------

Note that `message` is the name of a field; you can substitute the name of
any field (including `_all`) instead.

There are three types of `match` query: `boolean`, `phrase`, and `phrase_prefix`.

[[query-dsl-match-query-boolean]]
==== boolean

The default `match` query is of type `boolean`. The provided text is
analyzed, and the analysis process constructs a boolean query from the
resulting terms. The `operator` flag can be set to `or` or `and`
to control the boolean clauses (defaults to `or`). The minimum number of
optional `should` clauses to match can be set using the
<<query-dsl-minimum-should-match,`minimum_should_match`>>
parameter.
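
As a minimal sketch of the `minimum_should_match` parameter, the following
query asks that at least three quarters of the optional clauses match. The
value `75%` is an illustrative assumption, not a recommended setting:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "this is a test",
            "minimum_should_match" : "75%"
        }
    }
}
--------------------------------------------------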

The `analyzer` can be set to control which analyzer will perform the
analysis process on the text. It defaults to the analyzer in the field's
explicit mapping definition, or to the default search analyzer.

The `lenient` parameter can be set to `true` to ignore exceptions caused by
data-type mismatches, such as trying to query a numeric field with a text
query string. Defaults to `false`.
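
A sketch of `lenient` in use, assuming a hypothetical numeric field named
`age` (the field name and query text are illustrative only). With
`lenient` set to `true`, the data-type mismatch is ignored instead of
raising an exception:

[source,js]
--------------------------------------------------
{
    "match" : {
        "age" : {
            "query" : "not a number",
            "lenient" : true
        }
    }
}
--------------------------------------------------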

[[query-dsl-match-query-fuzziness]]
===== Fuzziness

`fuzziness` allows _fuzzy matching_ based on the type of field being queried.
See <<fuzziness>> for allowed settings.

The `prefix_length` and `max_expansions` parameters can be set in this case
to control the fuzzy process. If the fuzzy option is set, the query uses
`constant_score_rewrite` as its <<query-dsl-multi-term-rewrite,rewrite
method>>; the `fuzzy_rewrite` parameter controls how the query is
rewritten.
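
The fuzzy parameters can be combined in a single query. The sketch below is
intended to tolerate a one-character typo (`tesst`) while requiring the
first character of each term to match exactly. The specific values for
`fuzziness`, `prefix_length`, and `max_expansions` are assumptions chosen
for illustration:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "this is a tesst",
            "fuzziness" : 1,
            "prefix_length" : 1,
            "max_expansions" : 50
        }
    }
}
--------------------------------------------------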

Here is an example that provides additional parameters (note the slight
change in structure; `message` is the field name):

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "this is a test",
            "operator" : "and"
        }
    }
}
--------------------------------------------------

[[query-dsl-match-query-zero]]
===== Zero terms query

If the analyzer removes all tokens in a query, as a `stop` filter
can, the default behavior is to match no documents at all. To
change that, use the `zero_terms_query` option, which accepts
`none` (the default) and `all`, which corresponds to a `match_all` query.

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "to be or not to be",
            "operator" : "and",
            "zero_terms_query": "all"
        }
    }
}
--------------------------------------------------

[[query-dsl-match-query-cutoff]]
===== Cutoff frequency

The match query supports a `cutoff_frequency` that specifies an absolute
or relative document frequency. Terms above that frequency are moved into
an optional subquery and are scored only if one of the low frequency
(below the cutoff) terms matches in the case of an `or` operator, or if
all of the low frequency terms match in the case of an `and` operator.

This query handles `stopwords` dynamically at runtime, is domain
independent, and doesn't require a stopword file. It avoids scoring and
iterating over high frequency terms, and takes them into account only if a
more significant, lower frequency term matches a document. However, if all
of the query terms are above the given `cutoff_frequency`, the query is
automatically transformed into a pure conjunction (`and`) query to
ensure fast execution.

The `cutoff_frequency` is relative to the total number of
documents if it is in the range `[0..1)`, or absolute if it is greater
than or equal to `1.0`.

Here is an example showing a query composed of stopwords exclusively:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "to be or not to be",
            "cutoff_frequency" : 0.001
        }
    }
}
--------------------------------------------------
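
For comparison with the relative value used above, here is a sketch of an
absolute cutoff. The value `10` is an assumption for illustration; because
it is greater than or equal to `1.0`, it is read as a document count
rather than a fraction of the index:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "to be or not to be",
            "cutoff_frequency" : 10
        }
    }
}
--------------------------------------------------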

IMPORTANT: The `cutoff_frequency` option operates at the per-shard level. This means
that when trying it out on test indexes with few documents you
should follow the advice in {defguide}/relevance-is-broken.html[Relevance is broken].

[[query-dsl-match-query-phrase]]
==== phrase

The `match_phrase` query analyzes the text and creates a `phrase` query
out of the analyzed text. For example:

[source,js]
--------------------------------------------------
{
    "match_phrase" : {
        "message" : "this is a test"
    }
}
--------------------------------------------------

Since `match_phrase` is only a `type` of a `match` query, it can also be
used in the following manner:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "this is a test",
            "type" : "phrase"
        }
    }
}
--------------------------------------------------

A phrase query matches terms up to a configurable `slop`
(which defaults to 0) in any order. Transposed terms have a slop of 2.
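
As a sketch of `slop` in practice, the query below transposes the first two
terms. With a `slop` of 2, which is the cost of a transposition as noted
above, it should still match a document containing `this is a test`; with
the default `slop` of 0 it would not:

[source,js]
--------------------------------------------------
{
    "match_phrase" : {
        "message" : {
            "query" : "is this a test",
            "slop" : 2
        }
    }
}
--------------------------------------------------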

The `analyzer` can be set to control which analyzer will perform the
analysis process on the text. It defaults to the analyzer in the field's
explicit mapping definition, or to the default search analyzer. For example:

[source,js]
--------------------------------------------------
{
    "match_phrase" : {
        "message" : {
            "query" : "this is a test",
            "analyzer" : "my_analyzer"
        }
    }
}
--------------------------------------------------

[[query-dsl-match-query-phrase-prefix]]
==== match_phrase_prefix

The `match_phrase_prefix` is the same as `match_phrase`, except that it
allows for prefix matches on the last term in the text. For example:

[source,js]
--------------------------------------------------
{
    "match_phrase_prefix" : {
        "message" : "this is a test"
    }
}
--------------------------------------------------

Or:

[source,js]
--------------------------------------------------
{
    "match" : {
        "message" : {
            "query" : "this is a test",
            "type" : "phrase_prefix"
        }
    }
}
--------------------------------------------------

It accepts the same parameters as the phrase type. In addition, it
accepts a `max_expansions` parameter that controls how many
prefixes the last term is expanded to. It is highly recommended to set
it to an acceptable value to control the execution time of the query.
For example:

[source,js]
--------------------------------------------------
{
    "match_phrase_prefix" : {
        "message" : {
            "query" : "this is a test",
            "max_expansions" : 10
        }
    }
}
--------------------------------------------------

.Comparison to query_string / field
**************************************************
The match family of queries does not go through a "query parsing"
process. It does not support field name prefixes, wildcard characters,
or other "advanced" features. For this reason, the chance of a query
failing is very small or nonexistent, and it works very well when the
goal is simply to analyze text and run it as a query (which is
usually what a text search box does). Also, the `phrase_prefix` type can
provide a great "as you type" behavior to automatically load search
results.
**************************************************
  196. **************************************************