[[query-dsl-query-string-query]]
=== Query string query
++++
<titleabbrev>Query string</titleabbrev>
++++

A query that uses a query parser to parse its content. For example:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

The `query_string` query parses the input and splits text around operators.
Each textual part is analyzed independently of the others. For instance, the
following query:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "(new york city) OR (big apple)" <1>
        }
    }
}
--------------------------------------------------
// CONSOLE

<1> will be split into `new york city` and `big apple`, and each part is then
analyzed independently by the analyzer configured for the field.

WARNING: Whitespace is not considered an operator. This means that `new york city`
will be passed "as is" to the analyzer configured for the field. If the field is a `keyword`
field, the analyzer will create a single term `new york city` and the query builder will
use this term in the query. If you want to query each term separately, you need to add explicit
operators around the terms (e.g. `new AND york AND city`).

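
As a minimal sketch of the explicit-operator form mentioned in the warning
above, and assuming the same `content` field as in the earlier examples, the
following query passes `new`, `york` and `city` to the analyzer as three
separate parts and requires all of them to match:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "new AND york AND city"
        }
    }
}
--------------------------------------------------
// CONSOLE
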
When multiple fields are provided, it is also possible to modify how the different
field queries are combined inside each textual part using the `type` parameter.
The possible modes are described <<multi-match-types, here>> and the default is `best_fields`.

The `query_string` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`query` |The actual query to be parsed. See <<query-string-syntax>>.

|`default_field` |The default field for query terms if no prefix field is
specified. Defaults to the `index.query.default_field` index setting, which in
turn defaults to `*`. `*` extracts all fields in the mapping that are eligible
for term queries and filters out the metadata fields. All extracted fields are then
combined to build a query when no prefix field is provided.

WARNING: There is a limit on the number of fields that can be queried
at once. It is defined by the `indices.query.bool.max_clause_count` setting
(see <<search-settings>>), which defaults to 1024.

|`default_operator` |The default operator used if no explicit operator
is specified. For example, with a default operator of `OR`, the query
`capital of Hungary` is translated to `capital OR of OR Hungary`, and
with a default operator of `AND`, the same query is translated to
`capital AND of AND Hungary`. The default value is `OR`.

|`analyzer` |The name of the analyzer used to analyze the query string.

|`quote_analyzer` |The name of the analyzer that is used to analyze
quoted phrases in the query string. For those parts, it overrides other
analyzers that are set using the `analyzer` parameter or the
<<search-quote-analyzer,`search_quote_analyzer`>> setting.

|`allow_leading_wildcard` |When set, `*` or `?` are allowed as the first
character. Defaults to `true`.

|`enable_position_increments` |Set to `true` to enable position
increments in result queries. Defaults to `true`.

|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
expand to. Defaults to `50`.

|`fuzziness` |Sets the fuzziness for fuzzy queries. Defaults
to `AUTO`. See <<fuzziness>> for allowed settings.

|`fuzzy_prefix_length` |Sets the prefix length for fuzzy queries. Defaults
to `0`.

|`fuzzy_transpositions` |Set to `false` to disable fuzzy transpositions (`ab` -> `ba`).
Defaults to `true`.

|`phrase_slop` |Sets the default slop for phrases. If zero, then exact
phrase matches are required. Defaults to `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`analyze_wildcard` |By default, wildcard terms in a query string are
not analyzed. By setting this value to `true`, a best effort will be
made to analyze those as well.

|`max_determinized_states` |Limit on how many automaton states regexp
queries are allowed to create. This protects against too-difficult
(e.g. exponentially hard) regexps. Defaults to `10000`.

|`minimum_should_match` |A value controlling how many "should" clauses
in the resulting boolean query should match. It can be an absolute value
(`2`), a percentage (`30%`) or a
<<query-dsl-minimum-should-match,combination of both>>.

|`lenient` |If set to `true`, format-based failures (like
providing text to a numeric field) are ignored.

|`time_zone` |Time zone to be applied to any range query related to dates.

|`quote_field_suffix` |A suffix to append to fields for quoted parts of
the query string. This allows the use of a field that has a different analysis chain
for exact matching. See <<mixing-exact-search-with-stemming,here>> for a
comprehensive example.

|`auto_generate_synonyms_phrase_query` |Whether phrase queries should be automatically
generated for multi-term synonyms. Defaults to `true`.

|=======================================================================

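
As an illustration of `default_operator`, the following sketch reuses the
`capital of Hungary` example from the table and assumes a `content` field.
With the operator set to `AND`, the query is interpreted as
`capital AND of AND Hungary`:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "capital of Hungary",
            "default_operator" : "AND"
        }
    }
}
--------------------------------------------------
// CONSOLE
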
When a multi-term query is being generated, you can control how it gets
rewritten using the <<query-dsl-multi-term-rewrite,rewrite>> parameter.

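
For instance, the following sketch (assuming a `content` field) runs a
wildcard query and asks for the `constant_score` rewrite method. The available
methods are described in the <<query-dsl-multi-term-rewrite,rewrite>>
documentation; `constant_score` is used here only as an example value:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "qu?ck bro*",
            "rewrite" : "constant_score"
        }
    }
}
--------------------------------------------------
// CONSOLE
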
[float]
==== Default Field

When the field to search on is not explicitly specified in the query
string syntax, the `index.query.default_field` setting is used to derive
which field to search on. If `index.query.default_field` is not specified,
the `query_string` query automatically attempts to determine the existing fields in the index's
mapping that are queryable, and performs the search on those fields.
This does not include nested documents; use a nested query to search those documents.

NOTE: For mappings with a large number of fields, searching across all queryable
fields in the mapping could be expensive.

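
The following is a minimal sketch of the behavior described above. The index
name `my_index` and the field `content` are assumptions made for illustration
only. The first request creates an index with `index.query.default_field` set,
and the second runs a `query_string` query without a `default_field`, so the
query terms are searched in `content`:

[source,js]
--------------------------------------------------
PUT /my_index
{
    "settings": {
        "index.query.default_field": "content"
    }
}

GET /my_index/_search
{
    "query": {
        "query_string" : {
            "query" : "this AND that"
        }
    }
}
--------------------------------------------------
// CONSOLE
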
[float]
==== Multi Field

The `query_string` query can also run against multiple fields. Fields can be
provided via the `fields` parameter (example below).

The idea of running the `query_string` query against multiple fields is to
expand each query term to an OR clause like this:

    field1:query_term OR field2:query_term | ...

For example, the following query

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name"],
            "query" : "this AND that"
        }
    }
}
--------------------------------------------------
// CONSOLE

matches the same words as

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "query": "(content:this OR name:this) AND (content:that OR name:that)"
        }
    }
}
--------------------------------------------------
// CONSOLE

Since several queries are generated from the individual search terms,
they are automatically combined using a `dis_max` query with a `tie_breaker`.
For example (the `name` field is boosted by 5 using the `^5` notation):

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name^5"],
            "query" : "this AND that OR thus",
            "tie_breaker" : 0
        }
    }
}
--------------------------------------------------
// CONSOLE

A simple wildcard can also be used to search "within" specific inner
elements of the document. For example, if we have a `city` object with
several fields (or an inner object with fields) in it, we can automatically
search on all "city" fields:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["city.*"],
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

Another option is to provide the wildcard fields search in the query
string itself (properly escaping the `*` sign), for example
`city.\*:something`:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "query" : "city.\\*:(this AND that OR thus)"
        }
    }
}
--------------------------------------------------
// CONSOLE

NOTE: Since `\` (backslash) is a special character in JSON strings, it needs to
be escaped, hence the two backslashes in the above `query_string`.

When running the `query_string` query against multiple fields, the
following additional parameters are allowed:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description

|`type` |How the fields should be combined to build the text query.
See <<multi-match-types, types>> for a complete example.
Defaults to `best_fields`.

|`tie_breaker` |The disjunction max tie breaker for multi fields.
Defaults to `0`.
|=======================================================================

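
As a sketch of the `type` parameter, the following query (the `title` and
`content` fields are assumed for illustration) combines the per-field queries
using `most_fields` instead of the default `best_fields`:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["title", "content"],
            "query" : "this AND that",
            "type" : "most_fields"
        }
    }
}
--------------------------------------------------
// CONSOLE
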
The `fields` parameter can also include pattern-based field names,
which automatically expand to the relevant fields (dynamically
introduced fields included). For example:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name.*^5"],
            "query" : "this AND that OR thus"
        }
    }
}
--------------------------------------------------
// CONSOLE

[float]
==== Synonyms

The `query_string` query supports multi-term synonym expansion with the
<<analysis-synonym-graph-tokenfilter,synonym_graph>> token filter. When this filter
is used, the parser creates a phrase query for each multi-term synonym.
For example, the following synonym: `ny, new york` would produce:

`(ny OR ("new york"))`

It is also possible to match multi-term synonyms with conjunctions instead:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string" : {
            "default_field": "title",
            "query" : "ny city",
            "auto_generate_synonyms_phrase_query" : false
        }
    }
}
--------------------------------------------------
// CONSOLE

The example above creates a boolean query:

`(ny OR (new AND york)) city`

that matches documents with the term `ny` or the conjunction `new AND york`.
By default, the parameter `auto_generate_synonyms_phrase_query` is set to `true`.

[float]
==== Minimum should match

The `query_string` query splits the query around each operator to create a boolean
query for the entire input. You can use `minimum_should_match` to control how
many "should" clauses in the resulting query should match.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title"
            ],
            "query": "this that thus",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

The example above creates a boolean query:

`(title:this title:that title:thus)~2`

that matches documents with at least two of the terms `this`, `that` or `thus`
in the single field `title`.

[float]
===== Multi Field

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title",
                "content"
            ],
            "query": "this that thus",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

The example above creates a boolean query:

`((content:this content:that content:thus) | (title:this title:that title:thus))`

that matches documents with the disjunction max over the fields `title` and
`content`. Here the `minimum_should_match` parameter can't be applied, because
the input was not split into separate "should" clauses.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title",
                "content"
            ],
            "query": "this OR that OR thus",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

Adding explicit operators forces each term to be considered as a separate clause.

The example above creates a boolean query:

`((content:this | title:this) (content:that | title:that) (content:thus | title:thus))~2`

that matches documents with at least two of the three "should" clauses, each of
them made of the disjunction max over the fields for each term.

[float]
===== Cross Field

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "query_string": {
            "fields": [
                "title",
                "content"
            ],
            "query": "this OR that OR thus",
            "type": "cross_fields",
            "minimum_should_match": 2
        }
    }
}
--------------------------------------------------
// CONSOLE

The `cross_fields` value in the `type` field indicates that fields that have the
same analyzer should be grouped together when the input is analyzed.

The example above creates a boolean query:

`(blended(terms:[content:this, title:this]) blended(terms:[content:that, title:that]) blended(terms:[content:thus, title:thus]))~2`

that matches documents with at least two of the three per-term blended queries.

include::query-string-syntax.asciidoc[]