- [role="xpack"]
- [[rollup-search-limitations]]
- === {rollup-cap} search limitations
- experimental[]
- NOTE: For version 8.5 and above we recommend <<downsampling,downsampling>> over
- rollups as a way to reduce your storage costs for time series data.
- While we feel the Rollup functionality is extremely flexible, summarizing data inherently imposes some limitations. Once
- live data is thrown away, you always lose some flexibility.
- This page highlights the major limitations so that you are aware of them.
- [discrete]
- ==== Only one {rollup} index per search
- When using the <<rollup-search>> endpoint, the `index` parameter accepts one or more indices. These can be a mix of regular, non-rollup
- indices and rollup indices. However, only one rollup index can be specified. The exact rules for the `index` parameter are as
- follows:
- - At least one index or index pattern must be specified. This can be either a rollup or non-rollup index. Omitting the `index` parameter,
- or using `_all`, is not permitted.
- - Multiple non-rollup indices may be specified.
- - Only one rollup index may be specified. If more than one is supplied, an exception is thrown.
- - Index patterns may be used, but if they match more than one rollup index, an exception is thrown.
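The rules above can be sketched as a small client-side check. This is only an illustration under stated assumptions, not the server's actual logic: the function name `validate_rollup_search_indices` is hypothetical, the caller is assumed to already know which indices are rollup indices, and Python's `fnmatchcase` stands in for index-pattern matching.

```python
from fnmatch import fnmatchcase


def validate_rollup_search_indices(index_param, all_indices, rollup_indices):
    """Client-side sketch of the `index` rules (hypothetical helper).

    index_param:    comma-separated index names or wildcard patterns
    all_indices:    every index name known to the caller
    rollup_indices: the subset of all_indices that are rollup indices
    """
    expressions = [e.strip() for e in (index_param or "").split(",") if e.strip()]
    if not expressions:
        raise ValueError("at least one index or index pattern must be specified")
    if "_all" in expressions:
        raise ValueError("`_all` is not permitted")

    # Resolve each name or pattern against the known indices.
    resolved = set()
    for expression in expressions:
        resolved.update(n for n in all_indices if fnmatchcase(n, expression))

    # Any number of non-rollup indices is fine, but at most one rollup index.
    matched_rollups = sorted(resolved & set(rollup_indices))
    if len(matched_rollups) > 1:
        raise ValueError(f"only one rollup index may be searched: {matched_rollups}")
    return sorted(resolved)
```

With hypothetical indices `sensor-1`, `sensor-2`, `sensor_rollup`, and `other_rollup`, the expression `sensor-*,sensor_rollup` passes, while `*_rollup` is rejected because it matches two rollup indices.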
- This limitation is driven by the logic that decides which jobs are the "best" for any given query. If you have ten jobs stored in a single
- index, which cover the source data with varying degrees of completeness and different intervals, the query needs to determine which set
- of jobs to actually search. Incorrect decisions can lead to inaccurate aggregation results (e.g. over-counting doc counts, or bad metrics).
- Needless to say, this is a technically challenging piece of code.
- To help simplify the problem, we have limited search to just one rollup index at a time (which may contain multiple jobs). In the future we
- may be able to open this up to multiple rollup jobs.
- [discrete]
- [[aggregate-stored-only]]
- ==== Can only aggregate what's been stored
- This limitation may be obvious, but rollups can only aggregate on data that has been stored in the rollups. If you don't configure the
- rollup job to store metrics about the `price` field, you won't be able to use the `price` field in any query or aggregation.
- For example, the `temperature` field in the following query has been stored in a rollup job, but not with an `avg` metric, so
- using `avg` here is not allowed:
- [source,console]
- --------------------------------------------------
- GET sensor_rollup/_rollup_search
- {
- "size": 0,
- "aggregations": {
- "avg_temperature": {
- "avg": {
- "field": "temperature"
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[setup:sensor_prefab_data]
- // TEST[catch:/illegal_argument_exception/]
- The response reports that the aggregation could not be satisfied, because no rollup job was found that contains the field with the requested metric:
- [source,console-result]
- ----
- {
- "error": {
- "root_cause": [
- {
- "type": "illegal_argument_exception",
- "reason": "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
- "stack_trace": ...
- }
- ],
- "type": "illegal_argument_exception",
- "reason": "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
- "stack_trace": ...
- },
- "status": 400
- }
- ----
- // TESTRESPONSE[s/"stack_trace": \.\.\./"stack_trace": $body.$_path/]
- [discrete]
- ==== Interval granularity
- Rollups are stored at a certain granularity, as defined by the `date_histogram` group in the configuration. This means you
- can only search/aggregate the rollup data with an interval that is greater than or equal to the configured rollup interval.
- For example, if data is rolled up at hourly intervals, the <<rollup-search>> API can aggregate on any time interval
- hourly or greater. Intervals that are less than an hour will throw an exception, since the data simply doesn't
- exist for finer granularities.
- [[rollup-search-limitations-intervals]]
- .Requests must be multiples of the config
- **********************************
- It may not be immediately apparent, but the interval specified in an aggregation request must be a whole
- multiple of the configured interval. If the job was configured to roll up on `3d` intervals, you can only
- query and aggregate on multiples of three (`3d`, `6d`, `9d`, etc.).
- A non-multiple wouldn't work, since the rolled-up data wouldn't cleanly "overlap" with the buckets generated
- by the aggregation, leading to incorrect results.
- For that reason, an error is thrown if the requested interval isn't a whole multiple of the configured interval.
- **********************************
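The whole-multiple rule can be illustrated with a short check. This is a minimal sketch, not how the search endpoint is implemented: the helper names are hypothetical, and only fixed-length units (`s`, `m`, `h`, `d`) are handled, so calendar-aware intervals are out of scope.

```python
import re

# Fixed-length units only; this restriction is an assumption of the sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def interval_seconds(interval):
    """Convert a fixed interval such as `3d` or `90m` to seconds."""
    match = re.fullmatch(r"([1-9]\d*)([smhd])", interval)
    if match is None:
        raise ValueError(f"unsupported interval: {interval!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def is_valid_request_interval(requested, configured):
    """True if `requested` is a whole multiple of `configured`."""
    requested_s = interval_seconds(requested)
    configured_s = interval_seconds(configured)
    # Finer-than-configured intervals fail the first test,
    # non-multiples such as 4d over 3d fail the second.
    return requested_s >= configured_s and requested_s % configured_s == 0
```

For a job configured with `3d`, this accepts `6d` and rejects `4d`; for a job configured with `1h`, it rejects `30m` because the data does not exist at that granularity.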
- Because the <<rollup-search>> endpoint can "upsample" intervals, there is no need to configure jobs with multiple intervals (hourly, daily, etc.).
- We recommend configuring a single job with the smallest granularity that is needed, and letting the search endpoint upsample
- as needed.
- That said, if multiple jobs are present in a single rollup index with varying intervals, the search endpoint will identify and use the job(s)
- with the largest interval to satisfy the search request.
- [discrete]
- ==== Limited querying components
- The Rollup functionality allows a `query` in the search request, but only with a limited subset of components. The queries currently allowed are:
- - Term Query
- - Terms Query
- - Range Query
- - MatchAll Query
- - Any compound query (Boolean, Boosting, ConstantScore, etc.)
- Furthermore, these queries can only use fields that were also saved in the rollup job as a `group`.
- If you wish to filter on a keyword `hostname` field, that field must have been configured in the rollup job under a `terms` grouping.
- If you attempt to use an unsupported query, or the query references a field that wasn't configured in the rollup job, an exception will be
- thrown. We expect the list of supported queries to grow over time as more are implemented.
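As a rough illustration of these restrictions, the sketch below walks a query body and rejects unsupported query types or fields that are not rollup groups. It is a client-side approximation only: `check_rollup_query` and `grouped_fields` are hypothetical names, and only the three compound queries named above are covered.

```python
LEAF_QUERIES = {"term", "terms", "range", "match_all"}
# Compound queries mapped to the keys that hold their nested clauses.
COMPOUND_QUERIES = {
    "bool": ("must", "filter", "should", "must_not"),
    "boosting": ("positive", "negative"),
    "constant_score": ("filter",),
}
# Leaf-level options that are not field names.
NON_FIELD_KEYS = {"boost"}


def check_rollup_query(query, grouped_fields):
    """Raise ValueError for an unsupported query type or an ungrouped field."""
    for query_type, body in query.items():
        if query_type in COMPOUND_QUERIES:
            for key in COMPOUND_QUERIES[query_type]:
                clauses = body.get(key, [])
                # A clause slot may hold one query object or a list of them.
                for clause in clauses if isinstance(clauses, list) else [clauses]:
                    check_rollup_query(clause, grouped_fields)
        elif query_type in LEAF_QUERIES:
            for field in body:
                if field not in NON_FIELD_KEYS and field not in grouped_fields:
                    raise ValueError(f"field [{field}] is not a rollup group")
        else:
            raise ValueError(f"unsupported query type [{query_type}]")
```

Assuming a job that groups on `hostname` and `timestamp`, a `bool` filter combining a `term` on `hostname` with a `range` on `timestamp` passes, while a `match` query or a `term` on an ungrouped field raises an error.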
- [discrete]
- ==== Timezones
- Rollup documents are stored in the timezone of the `date_histogram` group configuration in the job. If no timezone is specified, the default
- is to roll up timestamps in `UTC`.