- [[query-dsl-percolate-query]]
- === Percolate Query
- The `percolate` query matches a document against queries
- stored in an index. The `percolate` query itself
- contains the document that is matched against
- the stored queries.
- [float]
- === Sample Usage
- Create an index with two mappings:
- [source,js]
- --------------------------------------------------
- PUT /my-index
- {
- "mappings": {
- "doctype": {
- "properties": {
- "message": {
- "type": "string"
- }
- }
- },
- "queries": {
- "properties": {
- "query": {
- "type": "percolator"
- }
- }
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- The `doctype` mapping is the mapping used to preprocess
- the document defined in the `percolate` query before it
- gets indexed into a temporary index.
- The `queries` mapping is the mapping used for indexing
- the query documents. The `query` field holds a JSON
- object that represents an actual Elasticsearch query. The
- `query` field has been configured to use the
- <<percolator,percolator field type>>. This field type understands
- the query DSL and stores the query in such a way that it
- can be used later on to match documents defined in the `percolate` query.
- Register a query in the percolator:
- [source,js]
- --------------------------------------------------
- PUT /my-index/queries/1
- {
- "query" : {
- "match" : {
- "message" : "bonsai tree"
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- Match a document to the registered percolator queries:
- [source,js]
- --------------------------------------------------
- GET /my-index/_search
- {
- "query" : {
- "percolate" : {
- "field" : "query",
- "document_type" : "doctype",
- "document" : {
- "message" : "A new bonsai tree in the office"
- }
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- The above request will yield the following response:
- [source,js]
- --------------------------------------------------
- {
- "took": 5,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "failed": 0
- },
- "hits": {
- "total": 1,
- "max_score": 0.5716521,
- "hits": [
- { <1>
- "_index": "my-index",
- "_type": "queries",
- "_id": "1",
- "_score": 0.5716521,
- "_source": {
- "query": {
- "match": {
- "message": "bonsai tree"
- }
- }
- }
- }
- ]
- }
- }
- --------------------------------------------------
- <1> The query with id `1` matches our document.
- [float]
- ==== Parameters
- The following parameters are required when percolating a document:
- [horizontal]
- `field`:: The field of type `percolator` that holds the indexed queries. This is a required parameter.
- `document_type`:: The type / mapping of the document being percolated. This is a required parameter.
- `document`:: The source of the document being percolated.
- Instead of specifying the source of the document being percolated, the source can also be retrieved from an already
- stored document. The `percolate` query then internally executes a get request to fetch that document.
- In that case the `document` parameter can be substituted with the following parameters:
- [horizontal]
- `index`:: The index the document resides in. This is a required parameter.
- `type`:: The type of the document to fetch. This is a required parameter.
- `id`:: The id of the document to fetch. This is a required parameter.
- `routing`:: Optionally, the routing to be used to fetch the document to percolate.
- `preference`:: Optionally, the preference to be used to fetch the document to percolate.
- `version`:: Optionally, the expected version of the document to be fetched.
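The two parameter sets above can be assembled programmatically. The following is a minimal Python sketch, not part of any Elasticsearch client: the helper name `build_percolate_query` is hypothetical, and rejecting a call that supplies both an inline `document` and a stored-document reference is a choice made by this helper rather than documented server behaviour. It only builds the request body as a plain dictionary.

```python
def build_percolate_query(field, document_type, document=None, **doc_ref):
    """Build a search body holding a `percolate` query (hypothetical helper).

    Pass either `document` (the inline source) or the keyword arguments
    index/type/id (plus optional routing/preference/version) that point
    at an already stored document.
    """
    percolate = {"field": field, "document_type": document_type}

    if document is not None:
        # Assumption of this sketch: the two styles are treated as exclusive.
        if doc_ref:
            raise ValueError("give either `document` or a stored-document reference")
        percolate["document"] = document
        return {"query": {"percolate": percolate}}

    # Stored-document style: index, type and id are required.
    missing = [name for name in ("index", "type", "id") if name not in doc_ref]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))

    allowed = {"index", "type", "id", "routing", "preference", "version"}
    unknown = set(doc_ref) - allowed
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(sorted(unknown)))

    percolate.update(doc_ref)
    return {"query": {"percolate": percolate}}


inline_body = build_percolate_query(
    "query", "doctype",
    document={"message": "A new bonsai tree in the office"},
)
stored_body = build_percolate_query(
    "query", "doctype",
    index="my-index", type="message", id="1", version=1,
)
```

Either dictionary can be serialized to JSON and sent as the body of a `GET /my-index/_search` request.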
- [float]
- ==== Percolating an Existing Document
- The `percolate` query can also be used to percolate a newly indexed document. The `_id` and other
- meta information returned in the response to an index request can be used to immediately percolate the newly added
- document.
- [float]
- ===== Example
- This example builds on the previous one.
- Index the document we want to percolate:
- [source,js]
- --------------------------------------------------
- PUT /my-index/message/1
- {
- "message" : "A new bonsai tree in the office"
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- Index response:
- [source,js]
- --------------------------------------------------
- {
- "_index": "my-index",
- "_type": "message",
- "_id": "1",
- "_version": 1,
- "_shards": {
- "total": 2,
- "successful": 1,
- "failed": 0
- },
- "created": true
- }
- --------------------------------------------------
- Percolate the existing document, using the index response as the basis for a new search request:
- [source,js]
- --------------------------------------------------
- GET /my-index/_search
- {
- "query" : {
- "percolate" : {
- "field": "query",
- "document_type" : "doctype",
- "index" : "my-index",
- "type" : "message",
- "id" : "1",
- "version" : 1 <1>
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- <1> The version is optional, but useful in certain cases. It ensures that we percolate
- the document we have just indexed. If the document has been changed after we indexed it,
- the search request fails with a version conflict error.
- The search response returned is identical to the one in the previous example.
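Deriving the search request from the index response can be sketched in Python as follows. The function name `percolate_body_from_index_response` is hypothetical, and the sketch assumes the index response has the shape shown in this example (`_index`, `_type`, `_id`, and optionally `_version`).

```python
def percolate_body_from_index_response(index_response, field, document_type):
    """Build a `percolate` search body that points at a just-indexed document."""
    percolate = {
        "field": field,
        "document_type": document_type,
        "index": index_response["_index"],
        "type": index_response["_type"],
        "id": index_response["_id"],
    }
    # The version is optional; include it only when the response carries one.
    if "_version" in index_response:
        percolate["version"] = index_response["_version"]
    return {"query": {"percolate": percolate}}


# Index response copied from the example in this section.
index_response = {
    "_index": "my-index",
    "_type": "message",
    "_id": "1",
    "_version": 1,
    "_shards": {"total": 2, "successful": 1, "failed": 0},
    "created": True,
}

body = percolate_body_from_index_response(index_response, "query", "doctype")
```

The resulting `body` has the same content as the search request shown in this example.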
- [float]
- ==== Percolate Query and Highlighting
- The `percolate` query is handled in a special way when it comes to highlighting. The hits of the stored queries are used
- to highlight the document that is provided in the `percolate` query, whereas with regular highlighting the query in
- the search request is used to highlight the hits.
- [float]
- ===== Example
- This example is based on the mapping of the first example.
- Save a query:
- [source,js]
- --------------------------------------------------
- PUT /my-index/queries/1
- {
- "query" : {
- "match" : {
- "message" : "brown fox"
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- Save another query:
- [source,js]
- --------------------------------------------------
- PUT /my-index/queries/2
- {
- "query" : {
- "match" : {
- "message" : "lazy dog"
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- Execute a search request with the `percolate` query and highlighting enabled:
- [source,js]
- --------------------------------------------------
- GET /my-index/_search
- {
- "query" : {
- "percolate" : {
- "field": "query",
- "document_type" : "doctype",
- "document" : {
- "message" : "The quick brown fox jumps over the lazy dog"
- }
- }
- },
- "highlight": {
- "fields": {
- "message": {}
- }
- }
- }
- --------------------------------------------------
- // CONSOLE
- // TEST[continued]
- This will yield the following response:
- [source,js]
- --------------------------------------------------
- {
- "took": 83,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "failed": 0
- },
- "hits": {
- "total": 2,
- "max_score": 0.5446649,
- "hits": [
- {
- "_index": "my-index",
- "_type": "queries",
- "_id": "2",
- "_score": 0.5446649,
- "_source": {
- "query": {
- "match": {
- "message": "lazy dog"
- }
- }
- },
- "highlight": {
- "message": [
- "The quick brown fox jumps over the <em>lazy</em> <em>dog</em>" <1>
- ]
- }
- },
- {
- "_index": "my-index",
- "_type": "queries",
- "_id": "1",
- "_score": 0.5446649,
- "_source": {
- "query": {
- "match": {
- "message": "brown fox"
- }
- }
- },
- "highlight": {
- "message": [
- "The quick <em>brown</em> <em>fox</em> jumps over the lazy dog" <1>
- ]
- }
- }
- ]
- }
- }
- --------------------------------------------------
- <1> The terms of each stored query are highlighted in the percolated document.
- Instead of the query in the search request highlighting the percolator hits, the percolator queries highlight
- the document defined in the `percolate` query.
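Because each hit carries its own highlighted copy of the percolated document, a client typically wants the fragments grouped by the id of the stored query that produced them. The Python sketch below shows one way to do that. The function name `highlights_by_query_id` is hypothetical, and the response is a trimmed copy of the one shown above.

```python
def highlights_by_query_id(search_response, field):
    """Map each matching stored query id to its highlight fragments for `field`."""
    result = {}
    for hit in search_response["hits"]["hits"]:
        # A hit without highlighting for this field yields an empty list.
        fragments = hit.get("highlight", {}).get(field, [])
        result[hit["_id"]] = list(fragments)
    return result


# Trimmed copy of the response above (only the keys used here).
response = {
    "hits": {
        "total": 2,
        "hits": [
            {
                "_id": "2",
                "highlight": {
                    "message": [
                        "The quick brown fox jumps over the <em>lazy</em> <em>dog</em>"
                    ]
                },
            },
            {
                "_id": "1",
                "highlight": {
                    "message": [
                        "The quick <em>brown</em> <em>fox</em> jumps over the lazy dog"
                    ]
                },
            },
        ],
    }
}

fragments = highlights_by_query_id(response, "message")
```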
- [float]
- ==== How it Works Under the Hood
- When indexing a document into an index that has the <<percolator,percolator field type>> mapping configured, the query
- part of the document gets parsed into a Lucene query and is kept in memory until that percolator document is removed.
- So, all the active percolator queries are kept in memory.
- At search time, the document specified in the request gets parsed into a Lucene document and is stored in an in-memory
- temporary Lucene index. This in-memory index can hold just this one document and is optimized for that. Then all the queries
- that are registered to the index targeted by the search request are executed against this single-document
- in-memory index. This happens on each shard the search request needs to execute on.
- By using `routing` or additional queries, the number of percolator queries that need to be executed can be reduced and thus
- the time the search API needs to run can be decreased.
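The flow described above (registered queries held in memory, one temporary document, every query executed against it) can be illustrated with a deliberately simplified Python model. This is a toy sketch, not how Lucene or Elasticsearch is implemented: the class `ToyPercolator` is hypothetical, "analysis" is just lowercasing and whitespace splitting, only `match`-style queries are modelled (a query matches if any of its terms occurs in the field), and there is no scoring, sharding, or routing.

```python
class ToyPercolator:
    """Toy model: stored queries stay in memory and run against one document."""

    def __init__(self):
        # query id -> (field name, set of query terms); kept until unregistered
        self._queries = {}

    def register(self, query_id, field, text):
        """Parse and store a `match`-like query when it is 'indexed'."""
        self._queries[query_id] = (field, set(text.lower().split()))

    def unregister(self, query_id):
        """Drop a stored query, as when the percolator document is removed."""
        self._queries.pop(query_id, None)

    def percolate(self, document):
        """Run every stored query against the single given document."""
        # Stand-in for the temporary single-document in-memory index.
        tokens = {
            field: set(str(value).lower().split())
            for field, value in document.items()
        }
        matches = []
        for query_id, (field, terms) in self._queries.items():
            if terms & tokens.get(field, set()):
                matches.append(query_id)
        return sorted(matches)


percolator = ToyPercolator()
percolator.register("1", "message", "brown fox")
percolator.register("2", "message", "lazy dog")
percolator.register("3", "message", "bonsai tree")

matching_ids = percolator.percolate(
    {"message": "The quick brown fox jumps over the lazy dog"}
)
```

In this model every registered query is visited on each `percolate` call, which is why reducing the set of candidate queries shortens the run time.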
|