|  | @@ -0,0 +1,266 @@
 | 
	
		
			
				|  |  | +[[query-string-syntax]]
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +==== Query string syntax
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The query string ``mini-language'' is used by the
 | 
	
		
			
				|  |  | +<<query-dsl-query-string-query>> and <<query-dsl-field-query>>, by the
 | 
	
		
			
				|  |  | +`q` query string parameter in the <<search-search,`search` API>> and
 | 
	
		
			
				|  |  | +by the `percolate` parameter  in the <<docs-index_,`index`>> and
 | 
	
		
			
				|  |  | +<<docs-bulk,`bulk`>> APIs.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The query string is parsed into a series of _terms_ and _operators_. A
 | 
	
		
			
				|  |  | +term can be a single word -- `quick` or `brown` -- or a phrase, surrounded by
 | 
	
		
			
				|  |  | +double quotes -- `"quick brown"` -- which searches for all the words in the
 | 
	
		
			
				|  |  | +phrase, in the same order.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Operators allow you to customize the search -- the available options are
 | 
	
		
			
				|  |  | +explained below.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Field names
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +As mentioned in <<query-dsl-query-string-query>>, the `default_field` is searched for the
 | 
	
		
			
				|  |  | +search terms, but it is possible to specify other fields in the query syntax:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* where the `status` field contains `active`
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    status:active
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* where the `title` field contains `quick` or `brown`
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    title:(quick brown)
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* where the `author` field contains the exact phrase `"john smith"`
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    author:"John Smith"
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* where any of the fields `book.title`, `book.content` or `book.date` contains
 | 
	
		
			
				|  |  | +  `quick` or `brown` (note how we need to escape the `*` with a backslash):
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    book.\*:(quick brown)
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* where the field `title` has no value (or is missing):
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    _missing_:title
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* where the field `title` has any non-null value:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    _exists_:title
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Wildcards
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Wildcard searches can be run on individual terms, using `?` to replace
 | 
	
		
			
				|  |  | +a single character, and `*` to replace zero or more characters:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    qu?ck bro*
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Be aware that wildcard queries can use an enormous amount of memory and
 | 
	
		
			
				|  |  | +perform very badly -- just think how many terms need to be queried to
 | 
	
		
			
				|  |  | +match the query string `"a* b* c*"`.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +[WARNING]
 | 
	
		
			
				|  |  | +======
 | 
	
		
			
				|  |  | +Allowing a wildcard at the beginning of a word (eg `"*ing"`) is particularly
 | 
	
		
			
				|  |  | +heavy, because all terms in the index need to be examined, just in case
 | 
	
		
			
				|  |  | +they match.  Leading wildcards can be disabled by setting
 | 
	
		
			
				|  |  | +`allow_leading_wildcard` to `false`.
 | 
	
		
			
				|  |  | +======
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Wildcarded terms are not analyzed by default -- they are lowercased
 | 
	
		
			
				|  |  | +(`lowercase_expanded_terms` defaults to `true`) but no further analysis
 | 
	
		
			
				|  |  | +is done, mainly because it is impossible to accurately analyze a word that
 | 
	
		
			
				|  |  | +is missing some of its letters.  However, by setting `analyze_wildcard` to
 | 
	
		
			
				|  |  | +`true`, an attempt will be made to analyze wildcarded words before searching
 | 
	
		
			
				|  |  | +the term list for matching terms.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Regular expressions
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Regular expression patterns can be embedded in the query string by
 | 
	
		
			
				|  |  | +wrapping them in forward-slashes (`"/"`):
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    name:/joh?n(ath[oa]n)/
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The supported regular expression syntax is explained in <<regexp-syntax>>.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +[WARNING]
 | 
	
		
			
				|  |  | +======
 | 
	
		
			
				|  |  | +The `allow_leading_wildcard` parameter does not have any control over
 | 
	
		
			
				|  |  | +regular expressions.  A query string such as the following would force
 | 
	
		
			
				|  |  | +Elasticsearch to visit every term in the index:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    /.*n/
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Use with caution!
 | 
	
		
			
				|  |  | +======
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Fuzziness
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +We can search for terms that are
 | 
	
		
			
				|  |  | +similar to, but not exactly like our search terms, using the ``fuzzy''
 | 
	
		
			
				|  |  | +operator:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    quikc~ brwn~ foks~
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +This uses the
 | 
	
		
			
				|  |  | +http://en.wikipedia.org/wiki/Damerau-Levenshtein_distance[Damerau-Levenshtein distance]
 | 
	
		
			
				|  |  | +to find all terms with a maximum of
 | 
	
		
			
				|  |  | +two changes, where a change is the insertion, deletion
 | 
	
		
			
				|  |  | +or substitution of a single character, or transposition of two adjacent
 | 
	
		
			
				|  |  | +characters.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The default _edit distance_ is `2`, but an edit distance of `1` should be
 | 
	
		
			
				|  |  | +sufficient to catch 80% of all human misspellings. It can be specified as:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    quikc~1
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Proximity searches
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +While a phrase query (eg `"john smith"`) expects all of the terms in exactly
 | 
	
		
			
				|  |  | +the same order, a proximity query allows the specified words to be further
 | 
	
		
			
				|  |  | +apart or in a different order.  In the same way that fuzzy queries can
 | 
	
		
			
				|  |  | +specify a maximum edit distance for characters in a word, a proximity search
 | 
	
		
			
				|  |  | +allows us to specify a maximum edit distance of words in a phrase:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    "fox quick"~5
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The closer the text in a field is to the original order specified in the
 | 
	
		
			
				|  |  | +query string, the more relevant that document is considered to be. When
 | 
	
		
			
				|  |  | +compared to the above example query, the phrase `"quick fox"` would be
 | 
	
		
			
				|  |  | +considered more relevant than `"quick brown fox"`.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Ranges
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Ranges can be specified for date, numeric or string fields. Inclusive ranges
 | 
	
		
			
				|  |  | +are specified with square brackets `[min TO max]` and exclusive ranges with
 | 
	
		
			
				|  |  | +curly brackets `{min TO max}`.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* All days in 2012:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    date:[2012/01/01 TO 2012/12/31]
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* Numbers 1..5
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    count:[1 TO 5]
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* Tags between `alpha` and `omega`, excluding `alpha` and `omega`:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    tag:{alpha TO omega}
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* Numbers from 10 upwards
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    count:[10 TO *]
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* Dates before 2012
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    date:{* TO 2012/01/01}
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The parsing of ranges in query strings can be complex and error prone. It is
 | 
	
		
			
				|  |  | +much more reliable to use an explicit <<query-dsl-range-filter,`range` filter>>.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Boosting
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Use the _boost_ operator `^` to make one term more relevant than another.
 | 
	
		
			
				|  |  | +For instance, if we want to find all documents about foxes, but we are
 | 
	
		
			
				|  |  | +especially interested in quick foxes:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    quick^2 fox
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The default `boost` value is 1, but can be any positive floating point number.
 | 
	
		
			
				|  |  | +Boosts between 0 and 1 reduce relevance.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Boosts can also be applied to phrases or to groups:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    "john smith"^2   (foo bar)^4
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Boolean operators
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +By default, all terms are optional, as long as one term matches.  A search
 | 
	
		
			
				|  |  | +for `foo bar baz` will find any document that contains one or more of
 | 
	
		
			
				|  |  | +`foo` or `bar` or `baz`.  We have already discussed the `default_operator`
 | 
	
		
			
				|  |  | +above which allows you to force all terms to be required, but there are
 | 
	
		
			
				|  |  | +also _boolean operators_ which can be used in the query string itself
 | 
	
		
			
				|  |  | +to provide more control.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The preferred operators are `+` (this term *must* be present) and `-`
 | 
	
		
			
				|  |  | +(this term *must not* be present). All other terms are optional.
 | 
	
		
			
				|  |  | +For example, this query:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    quick brown +fox -news
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +states that:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +* `fox` must be present
 | 
	
		
			
				|  |  | +* `news` must not be present
 | 
	
		
			
				|  |  | +* `quick` and `brown` are optional -- their presence increases the relevance
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The familiar operators `AND`, `OR` and `NOT` (also written `&&`, `||` and `!`)
 | 
	
		
			
				|  |  | +are also supported.  However, the effects of these operators can be more
 | 
	
		
			
				|  |  | +complicated than is obvious at first glance.  `NOT` takes precedence over
 | 
	
		
			
				|  |  | +`AND`, which takes precedence over `OR`.  While the `+` and `-` only affect
 | 
	
		
			
				|  |  | +the term to the right of the operator, `AND` and `OR` can affect the terms to
 | 
	
		
			
				|  |  | +the left and right.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +****
 | 
	
		
			
				|  |  | +Rewriting the above query using `AND`, `OR` and `NOT` demonstrates the
 | 
	
		
			
				|  |  | +complexity:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +`quick OR brown AND fox AND NOT news`::
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +This is incorrect, because `brown` is now a required term.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +`(quick OR brown) AND fox AND NOT news`::
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +This is incorrect because at least one of `quick` or `brown` is now required
 | 
	
		
			
				|  |  | +and the search for those terms would be scored differently from the original
 | 
	
		
			
				|  |  | +query.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +`((quick AND fox) OR (brown AND fox) OR fox) AND NOT news`::
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +This form now replicates the logic from the original query correctly, but
 | 
	
		
			
				|  |  | +the relevance scoring bares little resemblance to the original.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +In contrast, the same query rewritten using the <<query-dsl-match-query,`match` query>>
 | 
	
		
			
				|  |  | +would look like this:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    {
 | 
	
		
			
				|  |  | +        "bool": {
 | 
	
		
			
				|  |  | +            "must":     { "match": "fox"         },
 | 
	
		
			
				|  |  | +            "should":   { "match": "quick brown" },
 | 
	
		
			
				|  |  | +            "must_not": { "match": "news"        }
 | 
	
		
			
				|  |  | +        }
 | 
	
		
			
				|  |  | +    }
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +****
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Grouping
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Multiple terms or clauses can be grouped together with parentheses, to form
 | 
	
		
			
				|  |  | +sub-queries:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    (quick OR brown) AND fox
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Groups can be used to target a particular field, or to boost the result
 | 
	
		
			
				|  |  | +of a sub-query:
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +    status:(active OR pending) title:(full text search)^2
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +===== Reserved characters
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +If you need to use any of the characters which function as operators in your
 | 
	
		
			
				|  |  | +query itself (and not as operators), then you should escape them with
 | 
	
		
			
				|  |  | +a leading backslash. For instance, to search for `(1+1)=2`, you would
 | 
	
		
			
				|  |  | +need to write your query as `\(1\+1\)=2`.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +The reserved characters are:  `+ - && || ! ( ) { } [ ] ^ " ~ * ? : \ /`
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +Failing to escape these special characters correctly could lead to a syntax
 | 
	
		
			
				|  |  | +error which prevents your query from running.
 | 
	
		
			
				|  |  | +
 | 
	
		
			
				|  |  | +.Watch this space
 | 
	
		
			
				|  |  | +****
 | 
	
		
			
				|  |  | +A space may also be a reserved character.  For instance, if you have a
 | 
	
		
			
				|  |  | +synonym list which converts `"wi fi"` to `"wifi"`, a `query_string` search
 | 
	
		
			
				|  |  | +for `"wi fi"` would fail. The query string parser would interpret your
 | 
	
		
			
				|  |  | +query as a search for `"wi OR fi"`, while the token stored in your
 | 
	
		
			
				|  |  | +index is actually `"wifi"`.  Escaping the space will protect it from
 | 
	
		
			
				|  |  | +being touched by the query string parser: `"wi\ fi"`.
 | 
	
		
			
				|  |  | +****
 |