[DOCS] Add retrievers overview (#107959)

Liam Thompson 1 year ago
parent
commit
b2ebaeee7b

+ 207 - 0
docs/reference/search/search-your-data/retrievers-overview.asciidoc

@@ -0,0 +1,207 @@
+[[retrievers-overview]]
+== Retrievers 
+
+// Will move to a top level "Retrievers and reranking" section once reranking is live
+
+preview::[] 
+
+A retriever is an abstraction that was added to the Search API in *8.14.0*.
+This abstraction enables the configuration of multi-stage retrieval 
+pipelines within a single `_search` call. This simplifies your search 
+application logic, because you no longer need to configure complex searches via 
+multiple {es} calls or implement additional client-side logic to 
+combine results from different queries.
+
+This document provides a general overview of the retriever abstraction. 
+For implementation details, including notable restrictions, check out the 
+<<retriever,reference documentation>> in the `_search` API docs. 
+
+[discrete]
+[[retrievers-overview-types]]
+=== Retriever types 
+
+Retrievers come in various types, each tailored for different search operations.
+The following retrievers are currently available: 
+
+* <<standard-retriever,*Standard Retriever*>>. Returns top documents from a 
+traditional https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html[query]. 
+Mimics a traditional query but in the context of a retriever framework. This 
+ensures backward compatibility as existing `_search` requests remain supported. 
+That way you can transition to the new abstraction at your own pace without 
+mixing syntaxes.
+* <<knn-retriever,*kNN Retriever*>>. Returns top documents from a <<search-api-knn,kNN search>>,
+in the context of a retriever framework.
+* <<rrf-retriever,*RRF Retriever*>>. Combines and ranks multiple first-stage retrievers using
+the reciprocal rank fusion (RRF) algorithm. Allows you to combine multiple result sets 
+with different relevance indicators into a single result set.
+An RRF retriever is a *compound retriever*: its `filter` element is
+propagated to its sub retrievers.
++
+Sub retrievers cannot use elements that are restricted when a compound
+retriever is part of the retriever tree.
+See the <<rrf-using-multiple-standard-retrievers,RRF documentation>> for detailed
+examples and information on how to use the RRF retriever.
+
+[NOTE]
+====
+Stay tuned for more retriever types in future releases!
+====
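+
+As a rough illustration of a single first-stage retriever, the following sketch
+uses a kNN retriever on its own. The index name, the `text_embedding` field, and
+the query vector values are hypothetical and only show the general shape of the
+request. Refer to the <<knn-retriever,kNN retriever reference>> for the
+authoritative list of parameters.
+
+[source,js]
+----
+# Sketch only: "example-index", "text_embedding" and the vector values are assumptions
+GET example-index/_search
+{
+  "retriever": {
+    "knn": {
+      "field": "text_embedding",
+      "query_vector": [0.12, -0.45, 0.91],
+      "k": 10,
+      "num_candidates": 50
+    }
+  }
+}
+----
+//NOTCONSOLE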
+
+[discrete]
+=== What makes retrievers useful? 
+
+Here's an overview of what makes retrievers useful and how they differ from 
+regular queries. 
+
+. *Simplified user experience*. Retrievers simplify the user experience by
+allowing entire retrieval pipelines to be configured in a single API call.
+Backward compatibility is maintained because traditional query elements are
+automatically translated to the appropriate retriever.
+. *Structured retrieval*. Retrievers provide a more structured way to define search 
+operations. They allow searches to be described using a "retriever tree", a 
+hierarchical structure that clarifies the sequence and logic of operations, 
+making complex searches more understandable and manageable.
+. *Composability and flexibility*. Retrievers enable flexible composability, 
+allowing you to build pipelines and seamlessly integrate different retrieval 
+strategies into these pipelines. Retrievers make it easy to test out different 
+retrieval strategy combinations.
+. *Compound operations*. A retriever can have sub retrievers. This 
+allows complex nested searches where the results of one retriever feed into 
+another, supporting sophisticated querying strategies that might involve 
+multiple stages or criteria.
+. *Retrieval as a first-class concept*. Unlike 
+traditional queries, where the query is a part of a larger search API call, 
+retrievers are designed as standalone entities that can be combined or used in 
+isolation. This enables a more modular and flexible approach to constructing 
+searches.
+. *Enhanced control over document scoring and ranking*. Retrievers 
+allow for more explicit control over how documents are scored and filtered. For 
+instance, you can specify minimum score thresholds, apply complex filters 
+without affecting scoring, and use parameters like `terminate_after` for 
+performance optimizations.
+. *Integration with existing {es} functionalities*. Even though 
+retrievers can be used instead of existing `_search` API syntax (like the
+`query` and `knn` elements), they are designed to integrate seamlessly with features like
+pagination (`search_after`) and sorting. They also maintain compatibility with 
+aggregation operations by treating the combination of all leaf retrievers as 
+`should` clauses in a boolean query.
+. *Cleaner separation of concerns*. When using compound retrievers, only the 
+query element is allowed, which enforces a cleaner separation of concerns 
+and prevents the complexity that might arise from overly nested or 
+interdependent configurations.
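+
+To make the scoring and filtering controls listed above more concrete, here is a
+minimal sketch of a standard retriever that combines a query with a
+non-scoring `filter`, a `min_score` threshold, and `terminate_after`. The index
+name, the `in_stock` field, and all values are hypothetical. Check the
+<<standard-retriever,standard retriever reference>> for the exact semantics of
+each parameter.
+
+[source,js]
+----
+# Sketch only: index, field names and values are assumptions
+GET example-index/_search
+{
+  "retriever": {
+    "standard": {
+      "query": {
+        "match": {
+          "text": "blue shoes sale"
+        }
+      },
+      "filter": {
+        "term": {
+          "in_stock": true
+        }
+      },
+      "min_score": 1.0,
+      "terminate_after": 1000
+    }
+  }
+}
+----
+//NOTCONSOLE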
+
+[discrete]
+[[retrievers-overview-example]]
+=== Example
+
+The following example demonstrates how retrievers
+simplify the composition of queries for RRF ranking.
+
+[source,js]
+----
+GET example-index/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "standard": {
+            "query": {
+              "text_expansion": {
+                "vector.tokens": {
+                  "model_id": ".elser_model_2",
+                  "model_text": "What blue shoes are on sale?"
+                }
+              }
+            }
+          }
+        },
+        {
+          "standard": {
+            "query": {
+              "match": {
+                "text": "blue shoes sale"
+              }
+            }
+          }
+        }
+      ]
+    }
+  }
+}
+----
+//NOTCONSOLE
+
+This example demonstrates how you can combine different
+retrieval strategies into a single `retriever` pipeline.
+
+Compare this to RRF with the `sub_searches` approach:
+
+.*Expand* for example
+[%collapsible]
+==============
+
+[source,js]
+----
+GET example-index/_search
+{
+  "sub_searches":[
+    {
+      "query":{
+        "match":{
+          "text":"blue shoes sale"
+        }
+      }
+    },
+    {
+      "query":{
+        "text_expansion":{
+          "vector.tokens":{
+            "model_id":".elser_model_2",
+            "model_text":"What blue shoes are on sale?"
+          }
+        }
+      }
+    }
+  ],
+  "rank":{
+    "rrf":{
+      "window_size":50,
+      "rank_constant":20
+    }
+  }
+}
+----
+//NOTCONSOLE
+==============
+
+[discrete]
+[[retrievers-overview-glossary]]
+=== Glossary
+
+Here are some important terms: 
+
+* *Retrieval Pipeline*. Defines the entire retrieval and ranking logic to 
+produce top hits.
+* *Retriever Tree*. A hierarchical structure that defines how retrievers interact.
+* *First-stage Retriever*. Returns an initial set of candidate documents.
+* *Compound Retriever*. Builds on one or more retrievers, 
+enhancing document retrieval and ranking logic.
+* *Combiners*. Compound retrievers that merge top hits
+from multiple sub retrievers.
+//* NOT YET *Rerankers*. Special compound retrievers that reorder hits and may adjust the number of hits, with distinctions between first-stage and second-stage rerankers.
+
+[discrete]
+[[retrievers-overview-play-in-search]]
+=== Retrievers in action
+
+The Search Playground builds Elasticsearch queries using the retriever abstraction.
+It automatically detects the fields and types in your index and builds a retriever tree based on your selections.
+
+You can use the Playground to experiment with different retriever configurations and see how they affect search results.
+
+Refer to the {kibana-ref}/playground.html[Playground documentation] for more information.
+// Content coming in https://github.com/elastic/kibana/pull/182692
+
+
+

+ 2 - 1
docs/reference/search/search-your-data/search-your-data.asciidoc

@@ -43,10 +43,11 @@ DSL, with a simplified user experience. Create search applications based on your
 results directly in the Kibana Search UI.
 
 include::search-api.asciidoc[]
-include::search-application-overview.asciidoc[]
 include::knn-search.asciidoc[]
 include::semantic-search.asciidoc[]
+include::retrievers-overview.asciidoc[]
 include::learning-to-rank.asciidoc[]
 include::search-across-clusters.asciidoc[]
 include::search-with-synonyms.asciidoc[]
+include::search-application-overview.asciidoc[]
 include::behavioral-analytics/behavioral-analytics-overview.asciidoc[]