[DOCS] Rewrite terms query (#42889)

James Rodewig 6 years ago
parent
commit
cb527c2ece
2 changed files with 212 additions and 80 deletions
  1. 1 0
      docs/reference/index-modules.asciidoc
  2. 211 80
      docs/reference/query-dsl/terms-query.asciidoc

+ 1 - 0
docs/reference/index-modules.asciidoc

@@ -199,6 +199,7 @@ specific index module:
      This setting is only applicable when highlighting is requested on a text that was indexed without offsets or term vectors.
      Defaults to `1000000`.
 
+[[index-max-terms-count]]
  `index.max_terms_count`::
 
     The maximum number of terms that can be used in Terms Query.

+ 211 - 80
docs/reference/query-dsl/terms-query.asciidoc

@@ -1,121 +1,252 @@
 [[query-dsl-terms-query]]
 === Terms Query
 
-Filters documents that have fields that match any of the provided terms
-(*not analyzed*). For example:
+Returns documents that contain one or more *exact* terms in a provided field.
+
+The `terms` query is the same as the <<query-dsl-term-query, `term` query>>,
+except you can search for multiple values.
+
+[[terms-query-ex-request]]
+==== Example request
+
+The following search returns documents where the `user` field contains `kimchy`
+or `elasticsearch`.
 
 [source,js]
---------------------------------------------------
+----
 GET /_search
 {
-    "query": {
-        "terms" : { "user" : ["kimchy", "elasticsearch"]}
+    "query" : {
+        "terms" : {
+            "user" : ["kimchy", "elasticsearch"],
+            "boost" : 1.0 
+        }
     }
 }
---------------------------------------------------
+----
 // CONSOLE
 
-NOTE: Highlighting `terms` queries is best-effort only, so terms of a `terms`
-query might not be highlighted depending on the highlighter implementation that
-is selected and on the number of terms in the `terms` query.
+[[terms-top-level-params]]
+==== Top-level parameters for `terms`
+`<field>`::
++
+--
+Field you wish to search.
+
+The value of this parameter is an array of terms you wish to find in the
+provided field. To return a document, one or more terms must exactly match a
+field value, including whitespace and capitalization.
+
+By default, {es} limits the `terms` query to a maximum of 65,536
+terms. You can change this limit using the <<index-max-terms-count,
+`index.max_terms_count`>> setting.
+
+[NOTE]
+To use the field values of an existing document as search terms, use the
+<<query-dsl-terms-lookup, terms lookup>> parameters.
+--
+
+`boost`::
++
+--
+Floating point number used to decrease or increase the
+<<query-filter-context, relevance scores>> of a query. Default is `1.0`.
+Optional.
+
+You can use the `boost` parameter to adjust relevance scores for searches
+containing two or more queries.
+
+Boost values are relative to the default value of `1.0`. A boost value between
+`0` and `1.0` decreases the relevance score. A value greater than `1.0`
+increases the relevance score.
+--
+
+[[terms-query-notes]]
+==== Notes
+
+[[query-dsl-terms-query-highlighting]]
+===== Highlighting `terms` queries
+<<search-request-highlighting,Highlighting>> is best-effort only. {es} may not
+return highlight results for `terms` queries depending on:
+
+* Highlighter type
+* Number of terms in the query
 
-[float]
 [[query-dsl-terms-lookup]]
-===== Terms lookup mechanism
+===== Terms lookup
+Terms lookup fetches the field values of an existing document. {es} then uses
+those values as search terms. This can be helpful when searching for a large set
+of terms.
 
-When it's needed to specify a `terms` filter with a lot of terms it can
-be beneficial to fetch those term values from a document in an index. A
-concrete example would be to filter tweets tweeted by your followers.
-Potentially the amount of user ids specified in the terms filter can be
-a lot. In this scenario it makes sense to use the terms filter's terms
-lookup mechanism.
+Because terms lookup fetches values from a document, the <<mapping-source-field,
+`_source`>> mapping field must be enabled to use terms lookup. The `_source`
+field is enabled by default.
 
-The terms lookup mechanism supports the following options:
+[NOTE]
+By default, {es} limits the `terms` query to a maximum of 65,536
+terms. This includes terms fetched using terms lookup. You can change
+this limit using the <<index-max-terms-count, `index.max_terms_count`>> setting.
 
-[horizontal]
+To perform a terms lookup, use the following parameters.
+
+[[query-dsl-terms-lookup-params]]
+====== Terms lookup parameters
 `index`::
-    The index to fetch the term values from.
+Name of the index from which to fetch field values.
 
 `id`::
-    The id of the document to fetch the term values from.
+<<mapping-id-field,ID>> of the document from which to fetch field values.
 
 `path`::
-    The field specified as path to fetch the actual values for the
-    `terms` filter.
++
+--
+Name of the field from which to fetch field values. {es} uses
+these values as search terms for the query.
+
+If the field values include an array of nested inner objects, you can access
+those objects using dot notation syntax.
+--
 
 `routing`::
-    A custom routing value to be used when retrieving the
-    external terms doc.
-
-The values for the `terms` filter will be fetched from a field in a
-document with the specified id in the specified type and index.
-Internally a get request is executed to fetch the values from the
-specified path. At the moment for this feature to work the `_source`
-needs to be stored.
-
-Also, consider using an index with a single shard and fully replicated
-across all nodes if the "reference" terms data is not large. The lookup
-terms filter will prefer to execute the get request on a local node if
-possible, reducing the need for networking.
-
-[WARNING]
-Executing a Terms Query request with a lot of terms can be quite slow,
-as each additional term demands extra processing and memory.
-To safeguard against this, the maximum number of terms that can be used
-in a Terms Query both directly or through lookup has been limited to `65536`.
-This default maximum can be changed for a particular index with the index setting
- `index.max_terms_count`.
-
-[float]
-===== Terms lookup twitter example
-At first we index the information for user with id 2, specifically, its
-followers, then index a tweet from user with id 1. Finally we search on
-all the tweets that match the followers of user 2.
+Custom <<mapping-routing-field, routing value>> of the document from which to
+fetch term values. If a custom routing value was provided when the document was
+indexed, this parameter is required.
+
+[[query-dsl-terms-lookup-example]]
+====== Terms lookup example
+
+To see how terms lookup works, try the following example.
+
+. Create an index with a `keyword` field named `color`.
++
+--
 
 [source,js]
---------------------------------------------------
-PUT /users/_doc/2
+----
+PUT my_index
 {
-    "followers" : ["1", "3"]
+    "mappings" : {
+        "properties" : {
+            "color" : { "type" : "keyword" }
+        }
+    }
 }
+----
+// CONSOLE
+--
 
-PUT /tweets/_doc/1
+. Index a document with an ID of 1 and values of `["blue", "green"]` in the
+`color` field.
++
+--
+
+[source,js]
+----
+PUT my_index/_doc/1
 {
-    "user" : "1"
+  "color": ["blue", "green"]
 }
+----
+// CONSOLE
+// TEST[continued]
+--
 
-GET /tweets/_search
+. Index another document with an ID of 2 and a value of `blue` in the `color`
+field.
++
+--
+
+[source,js]
+----
+PUT my_index/_doc/2
 {
-    "query" : {
-        "terms" : {
-            "user" : {
-                "index" : "users",
-                "id" : "2",
-                "path" : "followers"
-            }
-        }
-    }
+  "color": "blue"
 }
---------------------------------------------------
+----
+// CONSOLE
+// TEST[continued]
+--
+
+. Use the `terms` query with terms lookup parameters to find documents
+containing one or more of the same terms as document 2. Include the `pretty`
+parameter so the response is more readable.
++
+--
+
+////
+
+[source,js]
+----
+POST my_index/_refresh
+----
 // CONSOLE
+// TEST[continued]
 
-The structure of the external terms document can also include an array of
-inner objects, for example:
+////
 
 [source,js]
---------------------------------------------------
-PUT /users/_doc/2
+----
+GET my_index/_search?pretty
 {
- "followers" : [
-   {
-     "id" : "1"
-   },
-   {
-     "id" : "2"
-   }
- ]
+  "query": {
+    "terms": {
+      "color" : {
+        "index" : "my_index",
+        "id" : "2",
+        "path" : "color"
+      }
+    }
+  }
 }
---------------------------------------------------
+----
 // CONSOLE
+// TEST[continued]
+
+Because document 2 and document 1 both contain `blue` as a value in the `color`
+field, {es} returns both documents.
 
-In which case, the lookup path will be `followers.id`.
+[source,js]
+----
+{
+  "took" : 17,
+  "timed_out" : false,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  },
+  "hits" : {
+    "total" : {
+      "value" : 2,
+      "relation" : "eq"
+    },
+    "max_score" : 1.0,
+    "hits" : [
+      {
+        "_index" : "my_index",
+        "_type" : "_doc",
+        "_id" : "1",
+        "_score" : 1.0,
+        "_source" : {
+          "color" : [
+            "blue",
+            "green"
+          ]
+        }
+      },
+      {
+        "_index" : "my_index",
+        "_type" : "_doc",
+        "_id" : "2",
+        "_score" : 1.0,
+        "_source" : {
+          "color" : "blue"
+        }
+      }
+    ]
+  }
+}
+----
+// TESTRESPONSE[s/"took" : 17/"took" : $body.took/]
+--
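
Reviewer note: the terms lookup walkthrough added in this commit can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not how {es} implements it. The in-memory `MY_INDEX` dictionary and the helper names `as_list`, `lookup_terms`, and `terms_query` are hypothetical and introduced purely for illustration. The sketch assumes string field values, resolves `path` with dot notation, and applies the documented default limit of 65,536 terms.

```python
# Hypothetical in-memory stand-in for an index: document ID -> _source.
MY_INDEX = {
    "1": {"color": ["blue", "green"]},
    "2": {"color": "blue"},
}

# Documented default of index.max_terms_count.
MAX_TERMS_COUNT = 65536


def as_list(value):
    """Treat a missing value, a single value, and an array uniformly."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def lookup_terms(index, doc_id, path):
    """Fetch the values found at a dot-notation path in one document."""
    values = [index[doc_id]]
    for key in path.split("."):
        next_values = []
        for value in values:
            # Arrays of inner objects are flattened one level per path step.
            for item in as_list(value):
                if isinstance(item, dict) and key in item:
                    next_values.append(item[key])
        values = next_values
    terms = [term for value in values for term in as_list(value)]
    if len(terms) > MAX_TERMS_COUNT:
        raise ValueError("too many terms for a terms query")
    return terms


def terms_query(index, field, terms):
    """Return IDs of documents whose field exactly matches any term."""
    wanted = set(terms)
    return [
        doc_id
        for doc_id, source in sorted(index.items())
        if wanted.intersection(as_list(source.get(field)))
    ]


# Mirror the example: use document 2's `color` values as search terms.
hits = terms_query(MY_INDEX, "color", lookup_terms(MY_INDEX, "2", "color"))
print(hits)
```

Matching here is exact, including capitalization, so searching for `Blue` returns no documents in this model. A path such as `followers.id` resolves against an array of inner objects in the same way as the example removed by this commit.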