
[Docs] Improve documentation of the new caching policy for filters.

Adrien Grand, 10 years ago
Parent
Current commit
fb6c3b7c29
1 file changed, with 61 insertions and 22 deletions

+ 61 - 22
docs/reference/query-dsl/filters.asciidoc

@@ -15,37 +15,76 @@ filter does not require a lot of memory, and will cause other queries
 executing against the same filter (same parameters) to be blazingly
 fast.
 
-Some filters already produce a result that is easily cacheable, and the
-difference between caching and not caching them is the act of placing
-the result in the cache or not. These filters, which include the
-<<query-dsl-term-filter,term>>,
+However, the cost of caching is not the same for all filters. For
+instance, some filters are already fast out of the box, so caching them could
+add significant overhead, while other filters produce results that are already
+cacheable, so caching them is just a matter of putting the result in the
+cache.
+
+The default caching policy, `_cache: auto`, tracks the 1000 most recently
+used filters on a per-index basis and makes decisions based on their
+frequency.
+
+[float]
+==== Filters that read the index structure directly
+
+Some filters can directly read the index structure and potentially jump
+over large sequences of documents that are not worth evaluating (for
+instance when these documents do not match the query). Caching these
+filters introduces overhead given that all documents that the filter
+matches need to be consumed in order to be loaded into the cache.
+
+These filters, which include the <<query-dsl-term-filter,term>> and
+<<query-dsl-term-query,query>> filters, are only cached after they
+appear 5 times or more in the history of the 1000 most recently used
+filters.
+
+[float]
+==== Filters that produce results that are already cacheable
+
+Some filters produce results that are already cacheable, and the difference
+between caching and not caching them is the act of placing the result in
+the cache or not. These filters, which include the
 <<query-dsl-terms-filter,terms>>,
 <<query-dsl-prefix-filter,prefix>>, and
-<<query-dsl-range-filter,range>> filters, are by
-default cached and are recommended to use (compared to the equivalent
-query version) when the same filter (same parameters) will be used
-across multiple different queries (for example, a range filter with age
-higher than 10).
-
-Other filters, usually already working with the field data loaded into
-memory, are not cached by default. Those filters are already very fast,
-and the process of caching them requires extra processing in order to
-allow the filter result to be used with different queries than the one
-executed. These filters, including the geo,
-and <<query-dsl-script-filter,script>> filters
-are not cached by default.
-
-The last type of filters are those working with other filters. The
+<<query-dsl-range-filter,range>> filters, are by default cached after they
+appear twice or more in the history of the 1000 most recently used filters.
+
+[float]
+==== Computational filters
+
+Some filters need to run some computation in order to figure out whether
+a given document matches. These filters, which include the geo and
+<<query-dsl-script-filter,script>> filters, but also the
+<<query-dsl-terms-filter,terms>> and <<query-dsl-range-filter,range>>
+filters when using the `fielddata` execution mode, are never cached by default,
+as caching them would require evaluating the filter on all documents in your
+indices, whereas they can otherwise be evaluated only on documents that match the query.
+
+[float]
+==== Compound filters
+
+The last type of filters are those working with other filters, and includes
+the <<query-dsl-bool-filter,bool>>,
 <<query-dsl-and-filter,and>>,
 <<query-dsl-not-filter,not>> and
-<<query-dsl-or-filter,or>> filters are not
-cached as they basically just manipulate the internal filters.
+<<query-dsl-or-filter,or>> filters.
+
+There is no general rule about these filters. Depending on the filters that
+they wrap, they will sometimes return a filter that dynamically evaluates the
+sub filters and sometimes evaluate the sub filters eagerly in order to return
+a result that is already cacheable. Depending on the case, these filters
+will therefore be cached after they appear 2+ or 5+ times in the history of the
+1000 most recently used filters.
+
+[float]
+==== Overriding the default behaviour
 
 All filters allow to set `_cache` element on them to explicitly control
 caching. It accepts 3 values: `true` in order to cache the filter, `false`
 to make sure that the filter will not be cached, and `auto`, which is the
 default and will decide on whether to cache the filter based on the cost
-to cache the filter and how often the filter has been used.
+to cache it and how often it has been used, as explained above.
 
 Filters also allow to set `_cache_key` which will be used as the
 caching key for that filter. This can be handy when using very large