- [[query-dsl-match-query]]
- === Match Query
- A family of `match` queries that accepts text, numerics, or dates,
- analyzes the input, and constructs a query from it. For example:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : "this is a test"
-     }
- }
- --------------------------------------------------
- Note that `message` is the name of a field; you can substitute the name
- of any field (including `_all`) instead.
- [float]
- ==== Types of Match Queries
- [float]
- ===== boolean
- The default `match` query is of type `boolean`. It means that the text
- provided is analyzed and the analysis process constructs a boolean query
- from the provided text. The `operator` flag can be set to `or` or `and`
- to control the boolean clauses (defaults to `or`). The minimum number of
- should clauses to match can be set using the
- <<query-dsl-minimum-should-match,`minimum_should_match`>>
- parameter.
- The `analyzer` can be set to control which analyzer performs the
- analysis process on the text. It defaults to the field's explicit mapping
- definition, or the default search analyzer.
- `fuzziness` can be set to a value (depending on the relevant type; for
- string types it should be a value between `0.0` and `1.0`) to construct
- fuzzy queries for each analyzed term. The `prefix_length` and
- `max_expansions` parameters can be set in this case to control the fuzzy
- process. If the fuzzy option is set, the query uses
- `constant_score_rewrite` as its
- <<query-dsl-multi-term-rewrite,rewrite method>>. The `rewrite`
- parameter controls how the query gets rewritten.
- Here is an example that provides additional parameters (note the slight
- change in structure: `message` is the field name):
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "this is a test",
-             "operator" : "and"
-         }
-     }
- }
- --------------------------------------------------
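- The other parameters described above fit into the same expanded form.
- The two sketches below are illustrative only: the values `75%`, `0.7`,
- `2` and `50` are assumptions chosen for the example, not recommended
- settings. The first sets `minimum_should_match` on the default `or`
- query:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "this is a test",
-             "minimum_should_match" : "75%"
-         }
-     }
- }
- --------------------------------------------------
- The second enables fuzzy matching on a string field and bounds the
- fuzzy process with `prefix_length` and `max_expansions`:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "this is a test",
-             "fuzziness" : 0.7,
-             "prefix_length" : 2,
-             "max_expansions" : 50
-         }
-     }
- }
- --------------------------------------------------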
- .zero_terms_query
- If the analyzer removes all tokens from a query, as a `stop` filter
- can, the default behavior is to match no documents at all. To change
- that, use the `zero_terms_query` option, which accepts `none` (the
- default) or `all`, which corresponds to a `match_all` query.
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "to be or not to be",
-             "operator" : "and",
-             "zero_terms_query" : "all"
-         }
-     }
- }
- --------------------------------------------------
- .cutoff_frequency
- The match query supports a `cutoff_frequency` that specifies an
- absolute or relative document frequency. Terms above the cutoff
- (high-frequency terms) are moved into an optional subquery and are only
- scored if one of the low-frequency terms (below the cutoff) matches in
- the case of an `or` operator, or if all of the low-frequency terms match
- in the case of an `and` operator.
- This query handles `stopwords` dynamically at runtime, is domain
- independent, and doesn't require a stopword file. It avoids scoring and
- iterating over high-frequency terms, and only takes those terms into
- account if a more significant, lower-frequency term matches a document.
- However, if all of the query terms are above the given
- `cutoff_frequency`, the query is automatically transformed into a pure
- conjunction (`and`) query to ensure fast execution.
- The `cutoff_frequency` is relative to the number of documents in the
- index if it is in the range `[0..1)`, or absolute if it is greater than
- or equal to `1.0`.
- Here is an example showing a query composed exclusively of stopwords:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "to be or not to be",
-             "cutoff_frequency" : 0.001
-         }
-     }
- }
- --------------------------------------------------
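- As a rough worked reading of the relative form, assume a hypothetical
- index of 1,000,000 documents: a `cutoff_frequency` of `0.001` then
- corresponds to about 1,000 documents, so terms that occur in more
- documents than that are treated as high-frequency terms. An absolute
- cutoff states that document count directly. The sketch below uses the
- illustrative value `1000` and an assumed query text that mixes common
- and less common words:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "the quick brown fox",
-             "cutoff_frequency" : 1000
-         }
-     }
- }
- --------------------------------------------------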
- [float]
- ===== phrase
- The `match_phrase` query analyzes the text and creates a `phrase` query
- out of the analyzed text. For example:
- [source,js]
- --------------------------------------------------
- {
-     "match_phrase" : {
-         "message" : "this is a test"
-     }
- }
- --------------------------------------------------
- Since `match_phrase` is only a `type` of a `match` query, it can also be
- used in the following manner:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "this is a test",
-             "type" : "phrase"
-         }
-     }
- }
- --------------------------------------------------
- A phrase query maintains order of the terms up to a configurable `slop`
- (which defaults to 0).
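- As a minimal sketch, `slop` is passed alongside `query` in the
- expanded form. The value `1` is an illustrative assumption; it allows
- a small amount of positional leeway between the terms instead of
- requiring an exact phrase:
- [source,js]
- --------------------------------------------------
- {
-     "match_phrase" : {
-         "message" : {
-             "query" : "this is a test",
-             "slop" : 1
-         }
-     }
- }
- --------------------------------------------------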
- The `analyzer` can be set to control which analyzer performs the
- analysis process on the text. It defaults to the field's explicit mapping
- definition, or the default search analyzer. For example:
- [source,js]
- --------------------------------------------------
- {
-     "match_phrase" : {
-         "message" : {
-             "query" : "this is a test",
-             "analyzer" : "my_analyzer"
-         }
-     }
- }
- --------------------------------------------------
- [float]
- ===== match_phrase_prefix
- The `match_phrase_prefix` query is the same as `match_phrase`, except
- that it allows for prefix matches on the last term in the text. For
- example:
- [source,js]
- --------------------------------------------------
- {
-     "match_phrase_prefix" : {
-         "message" : "this is a test"
-     }
- }
- --------------------------------------------------
- Or:
- [source,js]
- --------------------------------------------------
- {
-     "match" : {
-         "message" : {
-             "query" : "this is a test",
-             "type" : "phrase_prefix"
-         }
-     }
- }
- --------------------------------------------------
- It accepts the same parameters as the phrase type. In addition, it
- accepts a `max_expansions` parameter that controls how many prefixes
- the last term is expanded to. It is highly recommended to set it to an
- acceptable value to control the execution time of the query.
- For example:
- [source,js]
- --------------------------------------------------
- {
-     "match_phrase_prefix" : {
-         "message" : {
-             "query" : "this is a test",
-             "max_expansions" : 10
-         }
-     }
- }
- --------------------------------------------------
- [float]
- ==== Comparison to query_string / field
- The match family of queries does not go through a "query parsing"
- process. It does not support field name prefixes, wildcard characters,
- or other "advanced" features. For this reason, the chances of it failing
- are very small to nonexistent, and it behaves very well when the goal is
- simply to analyze the text and run it as a query (which is usually what
- a text search box does). Also, the `phrase_prefix` type can provide a
- great "as you type" behavior to automatically load search results.
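- As a sketch of the "as you type" case, a search box could send the
- text entered so far, with the last word still incomplete. The partial
- input `this is a te` and the `max_expansions` value are illustrative
- assumptions:
- [source,js]
- --------------------------------------------------
- {
-     "match_phrase_prefix" : {
-         "message" : {
-             "query" : "this is a te",
-             "max_expansions" : 10
-         }
-     }
- }
- --------------------------------------------------
- Here the final term `te` is treated as a prefix, while the preceding
- terms are matched as a phrase.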