A retriever is a specification to describe top documents returned from a search. A retriever replaces other elements of the search API that also return top documents, such as `query` and `knn`. A retriever may have child retrievers, where a retriever with two or more children is considered a compound retriever. This allows for complex behavior to be depicted in a tree-like structure, called the retriever tree, which clarifies the order of operations that occur during a search.
::::{tip} Refer to Retrievers for a high level overview of the retrievers abstraction. Refer to Retrievers examples for additional examples.
::::
::::{admonition} New API reference
For the most up-to-date API details, refer to Search APIs.
::::
The following retrievers are available:
standard
: A retriever that replaces the functionality of a traditional query.
knn
: A retriever that replaces the functionality of a knn search.
linear
: A retriever that linearly combines the scores of other retrievers for the top documents.
rescorer
: A retriever that replaces the functionality of the query rescorer.
rrf
: A retriever that produces top documents from reciprocal rank fusion (RRF).
text_similarity_reranker
: A retriever that enhances search results by re-ranking documents based on semantic similarity to a specified inference text, using a machine learning model.
rule
: A retriever that applies contextual query rules to pin or exclude documents for specific queries.
A standard retriever returns top documents from a traditional query.
query
: (Optional, query object)
Defines a query to retrieve a set of top documents.
filter
: (Optional, query object or list of query objects)
Applies a [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to this retriever, where all documents must match this query but do not contribute to the score.
search_after
: (Optional, search after object)
Defines a search after object parameter used for pagination.
terminate_after
: (Optional, integer) Maximum number of documents to collect for each shard. If a query reaches this limit, {{es}} terminates the query early. {{es}} collects documents before sorting.
::::{important}
Use with caution. {{es}} applies this parameter to each shard handling the request. When possible, let {{es}} perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.
::::
sort
: (Optional, sort object) A sort object that specifies the order of matching documents.
min_score
: (Optional, `float`)
Minimum [`_score`](/reference/query-languages/query-dsl/query-filter-context.md#relevance-scores) for matching documents. Documents with a lower `_score` are not included in the top documents.
collapse
: (Optional, collapse object)
Collapses the top documents by a specified key into a single top document per key.
When a retriever tree contains a compound retriever (a retriever with two or more child retrievers), the `search_after` parameter is not supported.
GET /restaurants/_search
{
"retriever": { <1>
"standard": { <2>
"query": { <3>
"bool": { <4>
"should": [ <5>
{
"match": { <6>
"region": "Austria"
}
}
],
"filter": [ <7>
{
"term": { <8>
"year": "2019" <9>
}
}
]
}
}
}
}
}
1. Opens the `retriever` object.
2. The `standard` retriever is used for defining traditional {{es}} queries.
3. The `query` object defines the query used to retrieve the top documents.
4. The `bool` object allows for combining multiple query clauses logically.
5. The `should` array indicates conditions under which a document will match. Documents matching these conditions will have increased relevancy scores.
6. The `match` object finds documents where the `region` field contains the word "Austria."
7. The `filter` array provides filtering conditions that must be met but do not contribute to the relevancy score.
8. The `term` object is used for exact matches, in this case, filtering documents by the `year` field.
9. The value to match in the `year` field.
A kNN retriever returns top documents from a k-nearest neighbor search (kNN).
field
: (Required, string)
The name of the vector field to search against. Must be a [`dense_vector` field with indexing enabled](/reference/elasticsearch/mapping-reference/dense-vector.md#index-vectors-knn-search).
query_vector
: (Required if `query_vector_builder` is not defined, array of `float`)
Query vector. Must have the same number of dimensions as the vector field you are searching against. Must be either an array of floats or a hex-encoded byte vector.
query_vector_builder
: (Required if `query_vector` is not defined, query vector builder object)
Defines a [model](docs-content://solutions/search/vector/knn.md#knn-semantic-search) to build a query vector.
k
: (Required, integer)
Number of nearest neighbors to return as top hits. This value must be fewer than or equal to `num_candidates`.
num_candidates
: (Required, integer)
The number of nearest neighbor candidates to consider per shard. Must be greater than or equal to `k` (or `size` if `k` is omitted) and cannot exceed 10,000. {{es}} collects `num_candidates` results from each shard, then merges them to find the top `k` results. Increasing `num_candidates` tends to improve the accuracy of the final `k` results. Defaults to `Math.min(1.5 * k, 10_000)`.
filter
: (Optional, query object or list of query objects)
Query to filter the documents that can match. The kNN search will return the top `k` documents that also match this filter. The value can be a single query or a list of queries. If `filter` is not provided, all documents are allowed to match.
similarity
: (Optional, float)
The minimum similarity required for a document to be considered a match. The similarity value calculated relates to the raw [`similarity`](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) used, not the document score. The matched documents are then scored according to [`similarity`](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) and the provided `boost` is applied.
The `similarity` parameter is the direct vector similarity calculation.
* `l2_norm`: also known as Euclidean, will include documents where the vector is within the `dims` dimensional hypersphere with radius `similarity` with origin at `query_vector`.
* `cosine`, `dot_product`, and `max_inner_product`: Only return vectors where the cosine similarity or dot-product are at least the provided `similarity`.
Read more here: [knn similarity search](docs-content://solutions/search/vector/knn.md#knn-similarity-search)
rescore_vector
: (Optional, object) Functionality in [preview]. Apply oversampling and rescoring to quantized vectors.
::::{note}
Rescoring only makes sense for quantized vectors; when quantization is not used, the original vectors are used for scoring. The rescore option is ignored for non-quantized `dense_vector` fields.
::::
oversample
: (Required, float)
Applies the specified oversample factor to `k` on the approximate kNN search. The approximate kNN search will:
* Retrieve `num_candidates` candidates per shard.
* From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors.
* The top `k` rescored candidates will be returned.
See oversampling and rescoring quantized vectors for details.
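The oversampling steps described above can be illustrated with a small, single-shard toy in Python. This is only a sketch: the `quantize` function (rounding to integers), the dot-product scoring, and the use of `math.ceil` for the rescoring window are all assumptions made for illustration, not a description of how {{es}} quantizes or rescores internally.

```python
import math


def quantize(vec):
    # Toy quantization (assumption): round each component to the nearest integer.
    return [round(x) for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def knn_with_rescore(docs, query, k, num_candidates, oversample):
    """docs maps a document ID to its original float vector (single shard)."""
    q = quantize(query)
    # 1. Approximate search on quantized vectors, keeping num_candidates.
    approx = sorted(docs, key=lambda d: dot(quantize(docs[d]), q), reverse=True)
    approx = approx[:num_candidates]
    # 2. Rescore the top k * oversample candidates with the original vectors.
    #    Rounding up with ceil is an assumption of this sketch.
    window = min(len(approx), math.ceil(k * oversample))
    rescored = sorted(approx[:window], key=lambda d: dot(docs[d], query), reverse=True)
    # 3. Return the top k rescored candidates.
    return rescored[:k]


docs = {"a": [0.6, 0.0], "b": [1.4, 0.0], "c": [0.9, 0.0]}
query = [1.0, 0.0]
print(knn_with_rescore(docs, query, k=1, num_candidates=3, oversample=3.0))  # ['b']
print(knn_with_rescore(docs, query, k=1, num_candidates=3, oversample=1.0))  # ['a']
```

In this toy data all three vectors quantize to the same value, so the approximate ranking cannot tell them apart. With an `oversample` of 3 the original vectors break the tie in favor of `b`; with an `oversample` of 1 only the first candidate is rescored.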
The parameters `query_vector` and `query_vector_builder` cannot be used together.
GET /restaurants/_search
{
"retriever": {
"knn": { <1>
"field": "vector", <2>
"query_vector": [10, 22, 77], <3>
"k": 10, <4>
"num_candidates": 10 <5>
}
}
}
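1. Configuration for the `knn` retriever.
2. The name of the vector field to search against.
3. The query vector against which document vectors are compared in the `knn` search.
4. The number of nearest neighbors to return as top hits. This value must be less than or equal to `num_candidates`.
5. The number of candidates considered per shard, from which the final `k` nearest neighbors are selected.
A retriever that normalizes and linearly combines the scores of other retrievers.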
retrievers
: (Required, array of objects)
A list of sub-retriever configurations whose result sets are merged through a weighted sum. Each configuration can specify its own weight and normalization.
Each entry specifies the following parameters:
retriever
:: (Required, a `retriever` object)
Specifies the retriever whose top documents are computed. The retriever produces `rank_window_size` results, which are later merged based on the specified `weight` and `normalizer`.
weight
:: (Optional, float)
The weight by which each score of this retriever’s top documents is multiplied. Must be greater than or equal to 0. Defaults to 1.0.
normalizer
:: (Optional, String)
Specifies how the retriever’s scores are normalized before the specified `weight` is applied. Available values are `minmax` and `none`. Defaults to `none`.
* `none`
* `minmax`: A `MinMaxScoreNormalizer` that normalizes scores based on the following formula:
score = (score - min) / (max - min)
See also this hybrid search example using a linear retriever on how to independently configure and apply normalizers to retrievers.
rank_window_size
: (Optional, integer)
This value determines the size of the individual result sets per query. A higher value will improve result relevance at the cost of performance. The final ranked result set is pruned down to the search request’s [size](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-size-param). `rank_window_size` must be greater than or equal to `size` and greater than or equal to `1`. Defaults to the `size` parameter.
filter
: (Optional, query object or list of query objects)
Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
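To make the weighted sum concrete, here is a minimal Python sketch of how per-retriever scores could be normalized and combined. It is an illustration of the arithmetic only: the function names, the input shape (a dict of document ID to score per retriever), and the handling of the `max == min` case are assumptions of this sketch, not {{es}} behavior.

```python
def minmax(scores):
    """Apply score = (score - min) / (max - min) to a dict of doc ID -> score."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption of this sketch: avoid dividing by zero when all scores are equal.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_combine(entries):
    """entries is a list of (scores, weight, normalizer) tuples, one per sub-retriever."""
    combined = {}
    for scores, weight, normalizer in entries:
        normalized = minmax(scores) if normalizer == "minmax" else dict(scores)
        for doc, score in normalized.items():
            # Each normalized score is multiplied by its weight, then summed per document.
            combined[doc] = combined.get(doc, 0.0) + weight * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


lexical = {"a": 10.0, "b": 5.0, "c": 0.0}
vector = {"a": 0.0, "b": 8.0, "c": 4.0}
ranked = linear_combine([(lexical, 1.0, "minmax"), (vector, 3.0, "minmax")])
print(ranked)  # [('b', 3.5), ('c', 1.5), ('a', 1.0)]
```

Normalizing with `minmax` puts both score sets on a 0 to 1 scale before weighting, so the weights express relative importance rather than compensating for different score ranges.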
An RRF retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers. Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.
retrievers
: (Required, array of retriever objects)
A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them. Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.
rank_constant
: (Optional, integer)
This value determines how much influence documents in individual result sets per query have over the final ranked result set. A higher value indicates that lower ranked documents have more influence. This value must be greater than or equal to `1`. Defaults to `60`.
rank_window_size
: (Optional, integer)
This value determines the size of the individual result sets per query. A higher value will improve result relevance at the cost of performance. The final ranked result set is pruned down to the search request’s [size](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-size-param). `rank_window_size` must be greater than or equal to `size` and greater than or equal to `1`. Defaults to the `size` parameter.
filter
: (Optional, query object or list of query objects)
Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
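As a rough illustration of how `rank_constant` and `rank_window_size` interact, the following Python sketch applies the usual RRF formula, where each document receives `1 / (rank_constant + rank)` from every result set in which it appears, with ranks starting at 1. The function name and the list-of-lists input are hypothetical, and this is a sketch of the formula rather than the {{es}} implementation.

```python
def rrf(rankings, rank_constant=60, rank_window_size=10, size=10):
    """rankings is a list of ranked doc ID lists, one per child retriever."""
    scores = {}
    for ranking in rankings:
        # Only the top rank_window_size documents of each result set are considered.
        for rank, doc in enumerate(ranking[:rank_window_size], start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (rank_constant + rank)
    # The fused result set is pruned down to the requested size.
    return sorted(scores, key=scores.get, reverse=True)[:size]


lexical = ["a", "b", "c"]
vector = ["b", "c", "a"]
print(rrf([lexical, vector], rank_constant=1, rank_window_size=50))  # ['b', 'a', 'c']
```

With `rank_constant` set to 1, document `b` scores 1/3 + 1/2, `a` scores 1/2 + 1/4, and `c` scores 1/4 + 1/3, which gives the order shown. A larger `rank_constant` flattens these differences, so lower-ranked documents carry relatively more influence.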
A simple hybrid search example (lexical search + dense vector search) combining a `standard` retriever with a `knn` retriever using RRF:
GET /restaurants/_search
{
"retriever": {
"rrf": { <1>
"retrievers": [ <2>
{
"standard": { <3>
"query": {
"multi_match": {
"query": "Austria",
"fields": [
"city",
"region"
]
}
}
}
},
{
"knn": { <4>
"field": "vector",
"query_vector": [10, 22, 77],
"k": 10,
"num_candidates": 10
}
}
],
"rank_constant": 1, <5>
"rank_window_size": 50 <6>
}
}
}
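1. Defines a retriever tree with an `rrf` retriever at its root.
2. The array of child retrievers.
3. The first child is a `standard` retriever.
4. The second child is a `knn` retriever.
5. The rank constant for the `rrf` retriever.
6. The rank window size for the `rrf` retriever.
A more complex hybrid search example (lexical search + ELSER sparse vector search + dense vector search) using RRF: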
GET movies/_search
{
"retriever": {
"rrf": {
"retrievers": [
{
"standard": {
"query": {
"sparse_vector": {
"field": "plot_embedding",
"inference_id": "my-elser-model",
"query": "films that explore psychological depths"
}
}
}
},
{
"standard": {
"query": {
"multi_match": {
"query": "crime",
"fields": [
"plot",
"title"
]
}
}
}
},
{
"knn": {
"field": "vector",
"query_vector": [10, 22, 77],
"k": 10,
"num_candidates": 10
}
}
]
}
}
}
The `rescorer` retriever re-scores only the results produced by its child retriever. For the `standard` and `knn` retrievers, the `window_size` parameter specifies the number of documents examined per shard.
For compound retrievers like `rrf`, the `window_size` parameter defines the total number of documents examined globally.
When using the `rescorer`, an error is returned if the following conditions are not met:
* The minimum configured rescore’s `window_size` is:
  * Greater than or equal to the `size` of the parent retriever for nested `rescorer` setups.
  * Greater than or equal to the `size` of the search request when used as the primary retriever in the tree.
* And the maximum rescore’s `window_size` is:
  * Smaller than or equal to the `size` or `rank_window_size` of the child retriever.
rescore
: (Required, a rescorer definition or an array of rescorer definitions)
Defines the [rescorers](/reference/elasticsearch/rest-apis/filter-search-results.md#rescore) applied sequentially to the top documents returned by the child retriever.
retriever
: (Required, `retriever`)
Specifies the child retriever responsible for generating the initial set of top documents to be re-ranked.
filter
: (Optional, query object or list of query objects)
Applies a [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to the retriever, ensuring that all documents match the filter criteria without affecting their scores.
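One way to read the `window_size` constraints described earlier in this section is as a pair of bounds: the smallest configured window must cover the number of documents the parent (or the search request) asks for, and the largest window cannot exceed what the child retriever returns. The Python sketch below encodes that reading. The function and parameter names are hypothetical, and the exact checks and error messages in {{es}} may differ.

```python
def validate_window_sizes(window_sizes, outer_size, child_window):
    """
    window_sizes: the window_size of each configured rescorer.
    outer_size:   size of the parent retriever, or of the search request at the root.
    child_window: size or rank_window_size of the child retriever.
    """
    if min(window_sizes) < outer_size:
        raise ValueError(
            f"minimum window_size {min(window_sizes)} is smaller than size {outer_size}"
        )
    if max(window_sizes) > child_window:
        raise ValueError(
            f"maximum window_size {max(window_sizes)} exceeds the child's window {child_window}"
        )


# Values matching a request with size 10, window_size 50, and a child rank_window_size of 100.
validate_window_sizes([50], outer_size=10, child_window=100)
```

Under this reading, a `window_size` of 5 with a `size` of 10 would be rejected, as would a `window_size` of 150 over a child retriever that only returns 100 documents.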
The `rescorer` retriever can be placed at any level within the retriever tree. The following example demonstrates a `rescorer` applied to the results produced by an `rrf` retriever:
GET movies/_search
{
"size": 10, <1>
"retriever": {
"rescorer": { <2>
"rescore": {
"window_size": 50, <3>
"query": { <4>
"rescore_query": {
"script_score": {
"query": {
"match_all": {}
},
"script": {
"source": "cosineSimilarity(params.queryVector, 'product-vector_final_stage') + 1.0",
"params": {
"queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
}
}
}
}
}
},
"retriever": { <5>
"rrf": {
"rank_window_size": 100, <6>
"retrievers": [
{
"standard": {
"query": {
"sparse_vector": {
"field": "plot_embedding",
"inference_id": "my-elser-model",
"query": "films that explore psychological depths"
}
}
}
},
{
"standard": {
"query": {
"multi_match": {
"query": "crime",
"fields": [
"plot",
"title"
]
}
}
}
},
{
"knn": {
"field": "vector",
"query_vector": [10, 22, 77],
"k": 10,
"num_candidates": 10
}
}
]
}
}
}
}
}
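1. The number of top documents to return in the final response.
2. A `rescorer` retriever applied as the final step.
3. The number of documents to rescore from the child retriever.
4. The definition of the `query` rescorer.
5. The child retriever that produces the documents to rescore.
6. The number of documents returned by the `rrf` retriever, which limits the documents available to the `rescorer`.
The `text_similarity_reranker` retriever uses an NLP model to improve search results by reordering the top-k documents based on their semantic similarity to the query.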
::::{tip} Refer to Semantic re-ranking for a high level overview of semantic re-ranking.
::::
To use `text_similarity_reranker` you must first set up an inference endpoint for the `rerank` task using the Create {{infer}} API. The endpoint should be set up with a machine learning model that can compute text similarity. Refer to the Elastic NLP model reference for a list of third-party text similarity models supported by {{es}}.
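Options include:
* Use the Elastic Rerank model through an {{es}} service inference endpoint.
* Use a Cohere Rerank inference endpoint with the `rerank` task type.
* Upload a model to {{es}} with Eland using the `text_similarity` NLP task type, then set up an {{es}} service inference endpoint with the `rerank` task type.
::::{important}
Scores from the re-ranking process are normalized using the following formula before being returned to the user, to avoid negative scores.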
score = max(score, 0) + min(exp(score), 1)
Using the above, any initially negative scores are projected to (0, 1) and non-negative scores to [1, infinity). To recover the original score if needed, apply the inverse to the normalized score:
score = score - 1, if score >= 1
score = ln(score), if score < 1
::::
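The normalization above and its inverse can be checked with a few lines of Python. This is a sketch for illustration: the function names are hypothetical, and the inverse branches on whether the normalized score is at least 1, which corresponds to the original score being non-negative.

```python
import math


def normalize(raw):
    """score = max(score, 0) + min(exp(score), 1), written out per case."""
    if raw >= 0:
        # max(raw, 0) == raw and min(exp(raw), 1) == 1
        return raw + 1.0
    # max(raw, 0) == 0 and min(exp(raw), 1) == exp(raw), which lies in (0, 1)
    return math.exp(raw)


def denormalize(score):
    """Recover the raw model score from a normalized score."""
    return score - 1.0 if score >= 1.0 else math.log(score)


for raw in (-3.0, -0.5, 0.0, 2.0):
    normalized = normalize(raw)
    print(raw, normalized, denormalize(normalized))
```

Because negative scores land in (0, 1) and non-negative scores in [1, infinity), the relative order of documents is preserved by the mapping.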
retriever
: (Required, `retriever`)
The child retriever that generates the initial set of top documents to be re-ranked.
field
: (Required, `string`)
The document field to be used for text similarity comparisons. This field should contain the text that will be evaluated against the `inference_text`.
inference_id
: (Required, `string`)
Unique identifier of the inference endpoint created using the {{infer}} API.
inference_text
: (Required, `string`)
The text snippet used as the basis for similarity comparison.
rank_window_size
: (Optional, `int`)
The number of top documents to consider in the re-ranking process. Defaults to `10`.
min_score
: (Optional, `float`)
Sets a minimum threshold score for including documents in the re-ranked results. Documents with similarity scores below this threshold will be excluded. Note that score calculations vary depending on the model used.
filter
: (Optional, query object or list of query objects)
Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to the child `retriever`. If the child retriever already specifies any filters, then this top-level filter is applied in conjunction with the filter defined in the child retriever.
::::{tip} Refer to this Python notebook for an end-to-end example using Elastic Rerank.
::::
This example demonstrates how to deploy the Elastic Rerank model and use it to re-rank search results using the `text_similarity_reranker` retriever.
Follow these steps:
Create an inference endpoint for the `rerank` task using the Create {{infer}} API.
PUT _inference/rerank/my-elastic-rerank
{
"service": "elasticsearch",
"service_settings": {
"model_id": ".rerank-v1",
"num_threads": 1,
"adaptive_allocations": { <1>
"enabled": true,
"min_number_of_allocations": 1,
"max_number_of_allocations": 10
}
}
}
Define a `text_similarity_reranker` retriever:
POST _search
{
"retriever": {
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"match": {
"text": "How often does the moon hide the sun?"
}
}
}
},
"field": "text",
"inference_id": "my-elastic-rerank",
"inference_text": "How often does the moon hide the sun?",
"rank_window_size": 100,
"min_score": 0.5
}
}
}
This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API. This approach eliminates the need to generate and store embeddings for all indexed documents. This requires a Cohere Rerank inference endpoint that is set up for the `rerank` task type.
GET /index/_search
{
"retriever": {
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"match_phrase": {
"text": "landmark in Paris"
}
}
}
},
"field": "text",
"inference_id": "my-cohere-rerank-model",
"inference_text": "Most famous landmark in Paris",
"rank_window_size": 100,
"min_score": 0.5
}
}
}
The following example uses the `cross-encoder/ms-marco-MiniLM-L-6-v2` model from Hugging Face to rerank search results based on semantic similarity. The model must be uploaded to {{es}} using Eland.
::::{tip} Refer to the Elastic NLP model reference for a list of third party text similarity models supported by {{es}}.
::::
Follow these steps to load the model and create a semantic re-ranker.
Install Eland using pip
python -m pip install 'eland[pytorch]'
Upload the model to {{es}} using Eland. This example assumes you have an Elastic Cloud deployment and an API key. Refer to the Eland documentation for more authentication options.
eland_import_hub_model \
--cloud-id $CLOUD_ID \
--es-api-key $ES_API_KEY \
--hub-model-id cross-encoder/ms-marco-MiniLM-L-6-v2 \
--task-type text_similarity \
--clear-previous \
--start
Create an inference endpoint for the `rerank` task:
PUT _inference/rerank/my-msmarco-minilm-model
{
"service": "elasticsearch",
"service_settings": {
"num_allocations": 1,
"num_threads": 1,
"model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
}
}
Define a `text_similarity_reranker` retriever.
POST movies/_search
{
"retriever": {
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"match": {
"genre": "drama"
}
}
}
},
"field": "plot",
"inference_id": "my-msmarco-minilm-model",
"inference_text": "films that explore psychological depths"
}
}
}
This retriever uses a standard `match` query to search the `movies` index for films tagged with the genre "drama". It then re-ranks the results based on semantic similarity to the text in the `inference_text` parameter, using the model we uploaded to {{es}}.
The `rule` retriever enables fine-grained control over search results by applying contextual query rules to pin or exclude documents for specific queries. This retriever has similar functionality to the rule query, but works out of the box with other retrievers.
To use the `rule` retriever you must first create one or more query rulesets using the query rules management APIs.
retriever
: (Required, `retriever`)
The child retriever that returns the results to apply query rules on top of. This can be a standalone retriever such as the [standard](#standard-retriever) or [knn](#knn-retriever) retriever, or it can be a compound retriever.
ruleset_ids
: (Required, `array`)
An array of one or more unique [query ruleset](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-query_rules) IDs with query-based rules to match and apply as applicable. Rulesets and their associated rules are evaluated in the order in which they are specified in the query and ruleset. The maximum number of rulesets to specify is 10.
match_criteria
: (Required, `object`)
Defines the match criteria to apply to rules in the given query ruleset(s). Match criteria should match the keys defined in the `criteria.metadata` field of the rule.
rank_window_size
: (Optional, `int`)
The number of top documents to return from the `rule` retriever. Defaults to `10`.
This example shows the `rule` retriever executed without any additional retrievers. It runs the query defined by the `retriever` and applies the rules from `my-ruleset` on top of the returned results.
GET movies/_search
{
"retriever": {
"rule": {
"match_criteria": {
"query_string": "harry potter"
},
"ruleset_ids": [
"my-ruleset"
],
"retriever": {
"standard": {
"query": {
"query_string": {
"query": "harry potter"
}
}
}
}
}
}
}
This example shows how to combine the `rule` retriever with other rerank retrievers such as `rrf` or `text_similarity_reranker`.
::::{warning}
The `rule` retriever will apply rules to any documents returned from its defined `retriever` or any of its sub-retrievers. This means that for the best results, the `rule` retriever should be the outermost defined retriever. Nesting a `rule` retriever as a sub-retriever under a reranker such as `rrf` or `text_similarity_reranker` may not produce the expected results.
::::
GET movies/_search
{
"retriever": {
"rule": { <1>
"match_criteria": {
"query_string": "harry potter"
},
"ruleset_ids": [
"my-ruleset"
],
"retriever": {
"rrf": { <2>
"retrievers": [
{
"standard": {
"query": {
"query_string": {
"query": "sorcerer's stone"
}
}
}
},
{
"standard": {
"query": {
"query_string": {
"query": "chamber of secrets"
}
}
}
}
]
}
}
}
}
}
1. The `rule` retriever is the outermost retriever, applying rules to the search results that were previously reranked using the `rrf` retriever.
2. The `rrf` retriever returns results from all of its sub-retrievers, and the output of the `rrf` retriever is used as input to the `rule` retriever.
`from` and `size` with a retriever tree [retriever-size-pagination]
The `from` and `size` parameters are provided globally as part of the general search API. They are applied to all retrievers in a retriever tree, unless a specific retriever overrides the `size` parameter using a different parameter such as `rank_window_size`. However, the final search hits are always limited to `size`.
Aggregations are globally specified as part of a search request. The query used for an aggregation is the combination of all leaf retrievers as `should` clauses in a boolean query.
When a retriever is specified as part of a search, the following elements are not allowed at the top-level:
* `query`
* `knn`
* `search_after`
* `terminate_after`
* `sort`
* `rescore` (use a `rescorer` retriever instead)