|
@@ -0,0 +1,207 @@
|
|
|
+[[retrievers-overview]]
|
|
|
+== Retrievers
|
|
|
+
|
|
|
+// Will move to a top level "Retrievers and reranking" section once reranking is live
|
|
|
+
|
|
|
+preview::[]
|
|
|
+
|
|
|
+A retriever is an abstraction that was added to the Search API in *8.14.0*.
|
|
|
+This abstraction enables the configuration of multi-stage retrieval
|
|
|
+pipelines within a single `_search` call. This simplifies your search
|
|
|
+application logic, because you no longer need to configure complex searches via
|
|
|
+multiple {es} calls or implement additional client-side logic to
|
|
|
+combine results from different queries.
|
|
|
+
|
|
|
+This document provides a general overview of the retriever abstraction.
|
|
|
+For implementation details, including notable restrictions, check out the
|
|
|
+<<retriever,reference documentation>> in the `_search` API docs.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[retrievers-overview-types]]
|
|
|
+=== Retriever types
|
|
|
+
|
|
|
+Retrievers come in various types, each tailored for different search operations.
|
|
|
+The following retrievers are currently available:
|
|
|
+
|
|
|
+* <<standard-retriever,*Standard Retriever*>>. Returns top documents from a
|
|
|
+traditional https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html[query].
|
|
|
+Mimics a traditional query but in the context of a retriever framework. This
|
|
|
+ensures backward compatibility as existing `_search` requests remain supported.
|
|
|
+That way you can transition to the new abstraction at your own pace without
|
|
|
+mixing syntaxes.
|
|
|
+* <<knn-retriever,*kNN Retriever*>>. Returns top documents from a <<search-api-knn,knn search>>,
|
|
|
+in the context of a retriever framework.
|
|
|
+* <<rrf-retriever,*RRF Retriever*>>. Combines and ranks multiple first-stage retrievers using
|
|
|
+the reciprocal rank fusion (RRF) algorithm. Allows you to combine multiple result sets
|
|
|
+with different relevance indicators into a single result set.
|
|
|
+An RRF retriever is a *compound retriever*, where its `filter` element is
|
|
|
+propagated to its sub retrievers.
|
|
|
++
|
|
|
+Sub retrievers may not use elements that
|
|
|
+are restricted by having a compound retriever as part of the retriever tree.
|
|
|
+See the <<rrf-using-multiple-standard-retrievers,RRF documentation>> for detailed
|
|
|
+examples and information on how to use the RRF retriever.
|
|
|
+
|
|
|
+[NOTE]
|
|
|
+====
|
|
|
+Stay tuned for more retriever types in future releases!
|
|
|
+====
|
|
|
+
|
|
|
+[discrete]
|
|
|
+=== What makes retrievers useful?
|
|
|
+
|
|
|
+Here's an overview of what makes retrievers useful and how they differ from
|
|
|
+regular queries.
|
|
|
+
|
|
|
+. *Simplified user experience*. Retrievers simplify the user experience by
|
|
|
+allowing entire retrieval pipelines to be configured in a single API call. This
|
|
|
+maintains backward compatibility with traditional query elements by
|
|
|
+automatically translating them to the appropriate retriever.
|
|
|
+. *Structured retrieval*. Retrievers provide a more structured way to define search
|
|
|
+operations. They allow searches to be described using a "retriever tree", a
|
|
|
+hierarchical structure that clarifies the sequence and logic of operations,
|
|
|
+making complex searches more understandable and manageable.
|
|
|
+. *Composability and flexibility*. Retrievers enable flexible composability,
|
|
|
+allowing you to build pipelines and seamlessly integrate different retrieval
|
|
|
+strategies into these pipelines. Retrievers make it easy to test out different
|
|
|
+retrieval strategy combinations.
|
|
|
+. *Compound operations*. A retriever can have sub retrievers. This
|
|
|
+allows complex nested searches where the results of one retriever feed into
|
|
|
+another, supporting sophisticated querying strategies that might involve
|
|
|
+multiple stages or criteria.
|
|
|
+. *Retrieval as a first-class concept*. Unlike
|
|
|
+traditional queries, where the query is a part of a larger search API call,
|
|
|
+retrievers are designed as standalone entities that can be combined or used in
|
|
|
+isolation. This enables a more modular and flexible approach to constructing
|
|
|
+searches.
|
|
|
+. *Enhanced control over document scoring and ranking*. Retrievers
|
|
|
+allow for more explicit control over how documents are scored and filtered. For
|
|
|
+instance, you can specify minimum score thresholds, apply complex filters
|
|
|
+without affecting scoring, and use parameters like `terminate_after` for
|
|
|
+performance optimizations.
|
|
|
+. *Integration with existing {es} functionalities*. Even though
|
|
|
+retrievers can be used instead of existing `_search` API syntax (like the
|
|
|
+`query` and `knn`), they are designed to integrate seamlessly with things like
|
|
|
+pagination (`search_after`) and sorting. They also maintain compatibility with
|
|
|
+aggregation operations by treating the combination of all leaf retrievers as
|
|
|
+`should` clauses in a boolean query.
|
|
|
+. *Cleaner separation of concerns*. When using compound retrievers, only the
|
|
|
+query element is allowed, which enforces a cleaner separation of concerns
|
|
|
+and prevents the complexity that might arise from overly nested or
|
|
|
+interdependent configurations.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[retrievers-overview-example]]
|
|
|
+=== Example
|
|
|
+
|
|
|
+The following example demonstrates how using retrievers
|
|
|
+simplify the composability of queries for RRF ranking.
|
|
|
+
|
|
|
+[source,js]
|
|
|
+----
|
|
|
+GET example-index/_search
|
|
|
+{
|
|
|
+ "retriever": {
|
|
|
+ "rrf": {
|
|
|
+ "retrievers": [
|
|
|
+ {
|
|
|
+ "standard": {
|
|
|
+ "query": {
|
|
|
+ "text_expansion": {
|
|
|
+ "vector.tokens": {
|
|
|
+ "model_id": ".elser_model_2",
|
|
|
+ "model_text": "What blue shoes are on sale?"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "standard": {
|
|
|
+ "query": {
|
|
|
+ "match": {
|
|
|
+ "text": "blue shoes sale"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ ]
|
|
|
+ }
|
|
|
+ }
|
|
|
+}
|
|
|
+----
|
|
|
+//NOTCONSOLE
|
|
|
+
|
|
|
+This example demonstrates how you can combine different
|
|
|
+retrieval strategies into a single `retriever` pipeline.
|
|
|
+
|
|
|
+Compare to `RRF` with `sub_searches` approach:
|
|
|
+
|
|
|
+.*Expand* for example
|
|
|
+[%collapsible]
|
|
|
+==============
|
|
|
+
|
|
|
+[source,js]
|
|
|
+----
|
|
|
+GET example-index/_search
|
|
|
+{
|
|
|
+ "sub_searches":[
|
|
|
+ {
|
|
|
+ "query":{
|
|
|
+ "match":{
|
|
|
+ "text":"blue shoes sale"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "query":{
|
|
|
+ "text_expansion":{
|
|
|
+ "vector.tokens":{
|
|
|
+ "model_id":".elser_model_2",
|
|
|
+ "model_text":"What blue shoes are on sale?"
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "rank":{
|
|
|
+ "rrf":{
|
|
|
+ "window_size":50,
|
|
|
+ "rank_constant":20
|
|
|
+ }
|
|
|
+ }
|
|
|
+}
|
|
|
+----
|
|
|
+//NOTCONSOLE
|
|
|
+==============
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[retrievers-overview-glossary]]
|
|
|
+=== Glossary
|
|
|
+
|
|
|
+Here are some important terms:
|
|
|
+
|
|
|
+* *Retrieval Pipeline*. Defines the entire retrieval and ranking logic to
|
|
|
+produce top hits.
|
|
|
+* *Retriever Tree*. A hierarchical structure that defines how retrievers interact.
|
|
|
+* *First-stage Retriever*. Returns an initial set of candidate documents.
|
|
|
+* *Compound Retriever*. Builds on one or more retrievers,
|
|
|
+enhancing document retrieval and ranking logic.
|
|
|
+* *Combiners*. Compound retrievers that merge top hits
|
|
|
+from multiple sub-retrievers.
|
|
|
+//* NOT YET *Rerankers*. Special compound retrievers that reorder hits and may adjust the number of hits, with distinctions between first-stage and second-stage rerankers.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[retrievers-overview-play-in-search]]
|
|
|
+=== Retrievers in action
|
|
|
+
|
|
|
+The Search Playground builds Elasticsearch queries using the retriever abstraction.
|
|
|
+It automatically detects the fields and types in your index and builds a retriever tree based on your selections.
|
|
|
+
|
|
|
+You can use the Playground to experiment with different retriever configurations and see how they affect search results.
|
|
|
+
|
|
|
+Refer to the {kibana-ref}/playground.html[Playground documentation] for more information.
|
|
|
+// Content coming in https://github.com/elastic/kibana/pull/182692
|
|
|
+
|
|
|
+
|
|
|
+
|