|
@@ -31,13 +31,170 @@ Other versions:
|
|
|
|
|
|
endif::[]
|
|
|
|
|
|
-// The notable-highlights tag marks entries that
|
|
|
-// should be featured in the Stack Installation and Upgrade Guide:
|
|
|
// tag::notable-highlights[]
|
|
|
-// [discrete]
|
|
|
-// === Heading
|
|
|
-//
|
|
|
-// Description.
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[esql_inlinestats]]
|
|
|
+=== ESQL: INLINESTATS
|
|
|
+This adds the `INLINESTATS` command to ESQL which performs a STATS and
|
|
|
+then enriches the results into the output stream. So, this query:
|
|
|
+
|
|
|
+[source,esql]
|
|
|
+----
|
|
|
+FROM test
|
|
|
+| INLINESTATS m=MAX(a * b) BY b
|
|
|
+| WHERE m == a * b
|
|
|
+| SORT a DESC, b DESC
|
|
|
+| LIMIT 3
|
|
|
+----
|
|
|
+
|
|
|
+Produces output like:
|
|
|
+
|
|
|
+| a | b | m |
|
|
|
+| --- | --- | ----- |
|
|
|
+| 99 | 999 | 98901 |
|
|
|
+| 99 | 998 | 98802 |
|
|
|
+| 99 | 997 | 98703 |
|
|
|
+
|
|
|
+{es-pull}109583[#109583]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[always_allow_rebalancing_by_default]]
|
|
|
+=== Always allow rebalancing by default
|
|
|
+In earlier versions of {es} the `cluster.routing.allocation.allow_rebalance` setting defaults to
|
|
|
+`indices_all_active` which blocks all rebalancing moves while the cluster is in `yellow` or `red` health. This was
|
|
|
+appropriate for the legacy allocator which might do too many rebalancing moves otherwise. Today's allocator has
|
|
|
+better support for rebalancing a cluster that is not in `green` health, and expects to be able to rebalance some
|
|
|
+shards away from over-full nodes to avoid allocating shards to undesirable locations in the first place. From
|
|
|
+version 8.16 `allow_rebalance` setting defaults to `always` unless the legacy allocator is explicitly enabled.
|
|
|
+
|
|
|
+{es-pull}111015[#111015]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[add_global_retention_in_data_stream_lifecycle]]
|
|
|
+=== Add global retention in data stream lifecycle
|
|
|
+Data stream lifecycle now supports configuring retention on a cluster level,
|
|
|
+namely global retention. Global retention \nallows us to configure two different
|
|
|
+retentions:
|
|
|
+
|
|
|
+- `data_streams.lifecycle.retention.default` is applied to all data streams managed
|
|
|
+by the data stream lifecycle that do not have retention defined on the data stream level.
|
|
|
+- `data_streams.lifecycle.retention.max` is applied to all data streams managed by the
|
|
|
+data stream lifecycle and it allows any data stream \ndata to be deleted after the `max_retention` has passed.
|
|
|
+
|
|
|
+{es-pull}111972[#111972]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[enable_zstandard_compression_for_indices_with_index_codec_set_to_best_compression]]
|
|
|
+=== Enable ZStandard compression for indices with index.codec set to best_compression
|
|
|
+Before DEFLATE compression was used to compress stored fields in indices with index.codec index setting set to
|
|
|
+best_compression, with this change ZStandard is used as compression algorithm to stored fields for indices with
|
|
|
+index.codec index setting set to best_compression. The usage ZStandard results in less storage usage with a
|
|
|
+similar indexing throughput depending on what options are used. Experiments with indexing logs have shown that
|
|
|
+ZStandard offers ~12% lower storage usage and a ~14% higher indexing throughput compared to DEFLATE.
|
|
|
+
|
|
|
+{es-pull}112665[#112665]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[8_x_remove_zstd_feature_flag_for_index_codec_best_compression]]
|
|
|
+=== [8.x] Remove zstd feature flag for index codec best compression
|
|
|
+Backports the following commits to 8.x: - Remove zstd feature flag for
|
|
|
+index codec best compression. (#112665)
|
|
|
+
|
|
|
+{es-pull}112857[#112857]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[esql_introduce_per_agg_filter]]
|
|
|
+=== ESQL: Introduce per agg filter
|
|
|
+Add support for aggregation scoped filters that work dynamically on the
|
|
|
+data in each group.
|
|
|
+
|
|
|
+[source,esql]
|
|
|
+----
|
|
|
+| STATS success = COUNT(*) WHERE 200 <= code AND code < 300,
|
|
|
+ redirect = COUNT(*) WHERE 300 <= code AND code < 400,
|
|
|
+ client_err = COUNT(*) WHERE 400 <= code AND code < 500,
|
|
|
+ server_err = COUNT(*) WHERE 500 <= code AND code < 600,
|
|
|
+ total_count = COUNT(*)
|
|
|
+----
|
|
|
+
|
|
|
+Implementation wise, the base AggregateFunction has been extended to
|
|
|
+allow a filter to be passed on. This is required to incorporate the
|
|
|
+filter as part of the aggregate equality/identity which would fail with
|
|
|
+the filter as an external component.
|
|
|
+As part of the process, the serialization for the existing aggregations
|
|
|
+had to be fixed so AggregateFunction implementations so that it
|
|
|
+delegates to their parent first.
|
|
|
+
|
|
|
+{es-pull}113735[#113735]
|
|
|
+
|
|
|
// end::notable-highlights[]
|
|
|
|
|
|
|
|
|
+[discrete]
|
|
|
+[[esql_multi_value_fields_supported_in_geospatial_predicates]]
|
|
|
+=== ESQL: Multi-value fields supported in Geospatial predicates
|
|
|
+Supporting multi-value fields in `WHERE` predicates is a challenge due to not knowing whether `ALL` or `ANY`
|
|
|
+of the values in the field should pass the predicate.
|
|
|
+For example, should the field `age:[10,30]` pass the predicate `WHERE age>20` or not?
|
|
|
+This ambiguity does not exist with the spatial predicates
|
|
|
+`ST_INTERSECTS` and `ST_DISJOINT`, because the choice between `ANY` or `ALL`
|
|
|
+is implied by the predicate itself.
|
|
|
+Consider a predicate checking a field named `location` against a test geometry named `shape`:
|
|
|
+
|
|
|
+* `ST_INTERSECTS(field, shape)` - true if `ANY` value can intersect the shape
|
|
|
+* `ST_DISJOINT(field, shape)` - true only if `ALL` values are disjoint from the shape
|
|
|
+
|
|
|
+This works even if the shape argument is itself a complex or compound geometry.
|
|
|
+
|
|
|
+Similar logic exists for `ST_CONTAINS` and `ST_WITHIN` predicates, but these are not as easily solved
|
|
|
+with `ANY` or `ALL`, because a collection of geometries contains another collection if each of the contained
|
|
|
+geometries is within at least one of the containing geometries. Evaluating this requires that the multi-value
|
|
|
+field is first combined into a single geometry before performing the predicate check.
|
|
|
+
|
|
|
+* `ST_CONTAINS(field, shape)` - true if the combined geometry contains the shape
|
|
|
+* `ST_WITHIN(field, shape)` - true if the combined geometry is within the shape
|
|
|
+
|
|
|
+{es-pull}112063[#112063]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[enhance_sort_push_down_to_lucene_to_cover_references_to_fields_st_distance_function]]
|
|
|
+=== Enhance SORT push-down to Lucene to cover references to fields and ST_DISTANCE function
|
|
|
+The most used and likely most valuable geospatial search query in Elasticsearch is the sorted proximity search,
|
|
|
+finding items within a certain distance of a point of interest and sorting the results by distance.
|
|
|
+This has been possible in ES|QL since 8.15.0, but the sorting was done in-memory, not pushed down to Lucene.
|
|
|
+Now the sorting is pushed down to Lucene, which results in a significant performance improvement.
|
|
|
+
|
|
|
+Queries that perform both filtering and sorting on distance are supported. For example:
|
|
|
+
|
|
|
+[source,esql]
|
|
|
+----
|
|
|
+FROM test
|
|
|
+| EVAL distance = ST_DISTANCE(location, TO_GEOPOINT("POINT(37.7749, -122.4194)"))
|
|
|
+| WHERE distance < 1000000
|
|
|
+| SORT distance ASC, name DESC
|
|
|
+| LIMIT 10
|
|
|
+----
|
|
|
+
|
|
|
+In addition, the support for sorting on EVAL expressions has been extended to cover references to fields:
|
|
|
+
|
|
|
+[source,esql]
|
|
|
+----
|
|
|
+FROM test
|
|
|
+| EVAL ref = field
|
|
|
+| SORT ref ASC
|
|
|
+| LIMIT 10
|
|
|
+----
|
|
|
+
|
|
|
+{es-pull}112938[#112938]
|
|
|
+
|
|
|
+[discrete]
|
|
|
+[[cross_cluster_search_telemetry]]
|
|
|
+=== Cross-cluster search telemetry
|
|
|
+The cross-cluster search telemetry is collected when cross-cluster searches
|
|
|
+are performed, and is returned as "ccs" field in `_cluster/stats` output.
|
|
|
+It also add a new parameter `include_remotes=true` to the `_cluster/stats` API
|
|
|
+which will collect data from connected remote clusters.
|
|
|
+
|
|
|
+{es-pull}113825[#113825]
|
|
|
+
|