
`indices.query.bool.max_clause_count` now limits all query clauses (#75297)

In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the default limit has been bumped from 1024 to 4096.

The semantics will effectively change only when we upgrade to Lucene 9; this PR
is only about agreeing on a migration strategy and documenting the change.

To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.
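
To illustrate the difference between the two semantics, here is a minimal sketch in Java. It is a toy model, not Lucene's actual accounting: `ClauseCountSketch`, `Node`, `maxClausesPerBool` and `totalLeafClauses` are hypothetical names introduced only for this example, and the new-style count is simplified to "number of leaf clauses in the tree".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ClauseCountSketch {

    /** Hypothetical toy query node: a leaf has no children, a bool node has one child per clause. */
    static final class Node {
        final List<Node> children;

        Node(List<Node> children) {
            this.children = children;
        }

        static Node leaf() {
            return new Node(Collections.<Node>emptyList());
        }

        static Node bool(Node... clauses) {
            return new Node(Arrays.asList(clauses));
        }
    }

    /** Old-style check (simplified): the largest number of direct clauses held by any single bool node. */
    static int maxClausesPerBool(Node node) {
        int max = node.children.size();
        for (Node child : node.children) {
            max = Math.max(max, maxClausesPerBool(child));
        }
        return max;
    }

    /** New-style check (simplified): the total number of leaf clauses across the whole query tree. */
    static int totalLeafClauses(Node node) {
        if (node.children.isEmpty()) {
            return 1; // a node without children counts as one leaf clause in this toy model
        }
        int total = 0;
        for (Node child : node.children) {
            total += totalLeafClauses(child);
        }
        return total;
    }

    /** An outer bool holding two inner bools of three leaves each, plus one extra leaf. */
    static Node example() {
        return Node.bool(
            Node.bool(Node.leaf(), Node.leaf(), Node.leaf()),
            Node.bool(Node.leaf(), Node.leaf(), Node.leaf()),
            Node.leaf());
    }

    public static void main(String[] args) {
        Node query = example();
        System.out.println("max clauses in a single bool: " + maxClausesPerBool(query));
        System.out.println("total leaf clauses in tree:   " + totalLeafClauses(query));
    }
}
```

In this model no single `bool` node holds more than 3 clauses, while the tree as a whole holds 7. A nested query can therefore pass a per-`bool` check and still exceed the same limit once it applies to the entire tree, which is why the default is being raised.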

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Adrien Grand, 4 years ago
commit feb6620d14

+ 1 - 1
docs/reference/mapping/mapping-settings-limit.asciidoc

@@ -14,7 +14,7 @@ especially in clusters with a high load or few resources.
 
 If you increase this setting, we recommend you also increase the
 <<search-settings,`indices.query.bool.max_clause_count`>> setting, which
-limits the maximum number of <<query-dsl-bool-query,boolean clauses>> in a query.
+limits the maximum number of clauses in a query.
 ====
 +
 [TIP]

+ 16 - 0
docs/reference/migration/migrate_8_0/search.asciidoc

@@ -20,6 +20,22 @@ Aggregating and sorting on `_id` should be avoided. As an alternative, the
 `_id` field's contents can be duplicated into another field with docvalues
 enabled (note that this does not apply to auto-generated IDs).
 ====
+
+[[max_clause_count_change]]
+.The `indices.query.bool.max_clause_count` setting now limits all query clauses.
+[%collapsible]
+====
+*Details* +
+Previously, the `indices.query.bool.max_clause_count` setting applied to the number
+of clauses of a single `bool` query. It now applies to the total number of
+clauses of the rewritten query. To reduce the chance of breakage, its
+default value has been increased from 1024 to 4096.
+
+*Impact* +
+Queries with many clauses should be avoided whenever possible. If you had bumped
+this setting already in order to accommodate some heavy queries, you might
+need to bump it further so that these heavy queries keep working.
+====
 //end::notable-breaking-changes[]
 
 .Search-related REST API endpoints containing mapping types have been removed.

+ 15 - 9
docs/reference/modules/indices/search-settings.asciidoc

@@ -7,16 +7,22 @@ limits.
 [[indices-query-bool-max-clause-count]]
 `indices.query.bool.max_clause_count`::
 (<<static-cluster-setting,Static>>, integer)
-Maximum number of clauses a Lucene BooleanQuery can contain. Defaults to `1024`.
+Maximum number of clauses a query can contain. Defaults to `4096`.
 +
-This setting limits the number of clauses a Lucene BooleanQuery can have. The
-default of 1024 is quite high and should normally be sufficient. This limit does
-not only affect Elasticsearchs `bool` query, but many other queries are rewritten to Lucene's
-BooleanQuery internally. The limit is in place to prevent searches from becoming too large
-and taking up too much CPU and memory. In case you're considering increasing this setting,
-make sure you've exhausted all other options to avoid having to do this. Higher values can lead 
-to performance degradations and memory issues, especially in clusters with a high load or 
-few resources.
+This setting limits the total number of clauses that a query tree can have. The default of 4096
+is quite high and should normally be sufficient. This limit applies to the rewritten query, so
+clauses can come not only from `bool` queries but also from any query that rewrites to a `bool`
+query internally, such as a `fuzzy` query. The limit is in place to prevent searches from
+becoming too large and taking up too much CPU and memory. Before increasing this setting,
+make sure you've exhausted all other options. Higher values can lead to performance
+degradation and memory issues, especially in clusters with a high load or few
+resources.
+
+Elasticsearch offers some tools to avoid running into the maximum number of clauses, such as
+the <<query-dsl-terms-query,`terms`>> query, which allows querying many distinct values while
+still counting as a single clause, or the <<index-prefixes,`index_prefixes`>> option of
+<<text-field-type,`text`>> fields, which allows executing prefix queries that expand to a high
+number of terms as a single term query.
 
 [[search-settings-max-buckets]]
 `search.max_buckets`::

+ 3 - 3
docs/reference/query-dsl/combined-fields-query.asciidoc

@@ -37,9 +37,9 @@ model perfectly.)
 [WARNING]
 .Field number limit
 ===================================================
-There is a limit on the number of fields that can be queried at once. It is
-defined by the `indices.query.bool.max_clause_count` <<search-settings>>
-which defaults to 1024.
+There is a limit on the number of fields times terms that can be queried at
+once. It is defined by the `indices.query.bool.max_clause_count`
+<<search-settings>> which defaults to 4096.
 ===================================================
 
 ==== Per-field boosting

+ 2 - 2
docs/reference/query-dsl/multi-match-query.asciidoc

@@ -67,9 +67,9 @@ index settings, which in turn defaults to `*`. `*` extracts all fields in the ma
 are eligible to term queries and filters the metadata fields. All extracted fields are then
 combined to build a query.
 
-WARNING: There is a limit on the number of fields that can be queried
+WARNING: There is a limit on the number of fields times terms that can be queried
 at once. It is defined by the `indices.query.bool.max_clause_count` <<search-settings>>
-which defaults to 1024.
+which defaults to 4096.
 
 [[multi-match-types]]
 [discrete]

+ 2 - 2
docs/reference/query-dsl/query-string-query.asciidoc

@@ -77,9 +77,9 @@ documents.
 For mappings with a large number of fields, searching across all eligible fields
 could be expensive.
 
-There is a limit on the number of fields that can be queried at once.
+There is a limit on the number of fields times terms that can be queried at once.
 It is defined by the `indices.query.bool.max_clause_count`
-<<search-settings,search setting>>, which defaults to 1024.
+<<search-settings,search setting>>, which defaults to 4096.
 ====
 --
 

+ 1 - 1
docs/reference/query-dsl/span-multi-term-query.asciidoc

@@ -39,7 +39,7 @@ GET /_search
 --------------------------------------------------
 
 WARNING: `span_multi` queries will hit too many clauses failure if the number of terms that match the query exceeds the
-boolean query limit (defaults to 1024).To avoid an unbounded expansion you can set the <<query-dsl-multi-term-rewrite,
+boolean query limit (defaults to 4096). To avoid an unbounded expansion you can set the <<query-dsl-multi-term-rewrite,
 rewrite method>> of the multi term query to `top_terms_*` rewrite. Or, if you use `span_multi` on `prefix` query only,
 you can activate the <<index-prefixes,`index_prefixes`>> field option of the `text` field instead. This will
 rewrite any prefix query on the field to a single term query that matches the indexed prefix.

+ 1 - 1
server/src/main/java/org/elasticsearch/search/SearchModule.java

@@ -263,7 +263,7 @@ import static java.util.Objects.requireNonNull;
  */
 public class SearchModule {
     public static final Setting<Integer> INDICES_MAX_CLAUSE_COUNT_SETTING = Setting.intSetting("indices.query.bool.max_clause_count",
-            1024, 1, Integer.MAX_VALUE, Setting.Property.NodeScope);
+            4096, 1, Integer.MAX_VALUE, Setting.Property.NodeScope);
 
     public static final Setting<Integer> INDICES_MAX_NESTED_DEPTH_SETTING = Setting.intSetting("indices.query.bool.max_nested_depth",
         20, 1, Integer.MAX_VALUE, Setting.Property.NodeScope);