123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309 |
- [role="xpack"]
- [[transform-limitations]]
- = {transform-cap} limitations
- [subs="attributes"]
- ++++
- <titleabbrev>Limitations</titleabbrev>
- ++++
- The following limitations and known problems apply to the {version} release of
- the Elastic {transform} feature. The limitations are grouped into the following
- categories:
- * <<transform-config-limitations>> apply to the configuration process of the
- {transforms}.
- * <<transform-operational-limitations>> affect the behavior of the {transforms}
- that are running.
- * <<transform-ui-limitations>> only apply to {transforms} managed via the user
- interface.
- [discrete]
- [[transform-config-limitations]]
- == Configuration limitations
- [discrete]
- [[transforms-underscore-limitation]]
- === Field names prefixed with underscores are omitted from latest {transforms}
- If you use the `latest` type of {transform} and the source index has field names
- that start with an underscore (_) character, they are assumed to be internal
- fields. Those fields are omitted from the documents in the destination index.
- [discrete]
- [[transforms-ccs-limitation]]
- === {transforms-cap} support {ccs} if the remote cluster is configured properly
- If you use <<modules-cross-cluster-search,{ccs}>>, the remote cluster must
- support the search and aggregations you use in your {transforms}.
- {transforms-cap} validate their configuration; if you use {ccs} and the
- validation fails, make sure that the remote cluster supports the query and
- aggregations you use.
- [discrete]
- [[transform-painless-limitation]]
- === Using scripts in {transforms}
- {transforms-cap} support scripting in every case when aggregations support them.
- However, there are certain factors you might want to consider when using scripts
- in {transforms}:
- * {transforms-cap} cannot deduce index mappings for output fields when the
- fields are created by a script. In this case, you might want to create the
- mappings of the destination index yourself prior to creating the transform.
- * Scripted fields may increase the runtime of the {transform}.
-
- * {transforms-cap} cannot optimize queries when you use scripts for all the
- groupings defined in `group_by`, you will receive a warning message when you
- use scripts this way.
- [discrete]
- [[transform-painless-warning-limitation]]
- === Deprecation warnings for Painless scripts in {transforms}
- If a {transform} contains Painless scripts that use deprecated syntax,
- deprecation warnings are displayed when the {transform} is previewed or started.
- However, it is not possible to check for deprecation warnings across all
- {transforms} as a bulk action because running the required queries might be a
- resource intensive process. Therefore any deprecation warnings due to deprecated
- Painless syntax are not available in the Upgrade Assistant.
- [discrete]
- [[transform-runtime-field-limitation]]
- === {transforms-cap} perform better on indexed fields
- {transforms-cap} sort data by a user-defined time field, which is frequently
- accessed. If the time field is a {ref}/runtime.html[runtime field], the
- performance impact of calculating field values at query time can significantly
- slow the {transform}. Use an indexed field as a time field when using
- {transforms}.
- [discrete]
- [[transform-scheduling-limitations]]
- === {ctransform-cap} scheduling limitations
- A {ctransform} periodically checks for changes to source data. The functionality
- of the scheduler is currently limited to a basic periodic timer which can be
- within the `frequency` range from 1s to 1h. The default is 1m. This is designed
- to run little and often. When choosing a `frequency` for this timer consider
- your ingest rate along with the impact that the {transform}
- search/index operations has other users in your cluster. Also note that retries
- occur at `frequency` interval.
- [discrete]
- [[transform-operational-limitations]]
- == Operational limitations
- [discrete]
- [[transform-aggresponse-limitations]]
- === Aggregation responses may be incompatible with destination index mappings
- When a pivot {transform} is first started, it will deduce the mappings
- required for the destination index. This process is based on the field types of
- the source index and the aggregations used. If the fields are derived from
- <<search-aggregations-metrics-scripted-metric-aggregation,`scripted_metrics`>>
- or <<search-aggregations-pipeline-bucket-script-aggregation,`bucket_scripts`>>,
- <<dynamic-mapping,dynamic mappings>> will be used. In some instances the
- deduced mappings may be incompatible with the actual data. For example, numeric
- overflows might occur or dynamically mapped fields might contain both numbers
- and strings. Please check {es} logs if you think this may have occurred.
- You can view the deduced mappings by using the
- <<preview-transform,preview transform API>>.
- See the `generated_dest_index` object in the API response.
- If it's required, you may define custom mappings prior to starting the
- {transform} by creating a custom destination index using the
- <<indices-create-index,create index API>>.
- As deduced mappings cannot be overwritten by an index template, use the create
- index API to define custom mappings. The index templates only apply to fields
- derived from scripts that use dynamic mappings.
- [discrete]
- [[transform-batch-limitations]]
- === Batch {transforms} may not account for changed documents
- A batch {transform} uses a
- <<search-aggregations-bucket-composite-aggregation,composite aggregation>>
- which allows efficient pagination through all buckets. Composite aggregations
- do not yet support a search context, therefore if the source data is changed
- (deleted, updated, added) while the batch {dataframe} is in progress, then the
- results may not include these changes.
- [discrete]
- [[transform-consistency-limitations]]
- === {ctransform-cap} consistency does not account for deleted or updated documents
- While the process for {transforms} allows the continual recalculation of the
- {transform} as new data is being ingested, it does also have some limitations.
- Changed entities will only be identified if their time field has also been
- updated and falls within the range of the action to check for changes. This has
- been designed in principle for, and is suited to, the use case where new data is
- given a timestamp for the time of ingest.
- If the indices that fall within the scope of the source index pattern are
- removed, for example when deleting historical time-based indices, then the
- composite aggregation performed in consecutive checkpoint processing will search
- over different source data, and entities that only existed in the deleted index
- will not be removed from the {dataframe} destination index.
- Depending on your use case, you may wish to recreate the {transform} entirely
- after deletions. Alternatively, if your use case is tolerant to historical
- archiving, you may wish to include a max ingest timestamp in your aggregation.
- This will allow you to exclude results that have not been recently updated when
- viewing the destination index.
- [discrete]
- [[transform-deletion-limitations]]
- === Deleting a {transform} does not delete the destination index or {kib} index pattern
- When deleting a {transform} using `DELETE _transform/index`
- neither the destination index nor the {kib} index pattern, should one have been
- created, are deleted. These objects must be deleted separately.
- [discrete]
- [[transform-aggregation-page-limitations]]
- === Handling dynamic adjustment of aggregation page size
- During the development of {transforms}, control was favoured over performance.
- In the design considerations, it is preferred for the {transform} to take longer
- to complete quietly in the background rather than to finish quickly and take
- precedence in resource consumption.
- Composite aggregations are well suited for high cardinality data enabling
- pagination through results. If a <<circuit-breaker,circuit breaker>> memory
- exception occurs when performing the composite aggregated search then we try
- again reducing the number of buckets requested. This circuit breaker is
- calculated based upon all activity within the cluster, not just activity from
- {transforms}, so it therefore may only be a temporary resource
- availability issue.
- For a batch {transform}, the number of buckets requested is only ever adjusted
- downwards. The lowering of value may result in a longer duration for the
- {transform} checkpoint to complete. For {ctransforms}, the number of buckets
- requested is reset back to its default at the start of every checkpoint and it
- is possible for circuit breaker exceptions to occur repeatedly in the {es} logs.
- The {transform} retrieves data in batches which means it calculates several
- buckets at once. Per default this is 500 buckets per search/index operation. The
- default can be changed using `max_page_search_size` and the minimum value is 10.
- If failures still occur once the number of buckets requested has been reduced to
- its minimum, then the {transform} will be set to a failed state.
- [discrete]
- [[transform-dynamic-adjustments-limitations]]
- === Handling dynamic adjustments for many terms
- For each checkpoint, entities are identified that have changed since the last
- time the check was performed. This list of changed entities is supplied as a
- <<query-dsl-terms-query,terms query>> to the {transform} composite aggregation,
- one page at a time. Then updates are applied to the destination index for each
- page of entities.
- The page `size` is defined by `max_page_search_size` which is also used to
- define the number of buckets returned by the composite aggregation search. The
- default value is 500, the minimum is 10.
- The index setting <<dynamic-index-settings,`index.max_terms_count`>> defines
- the maximum number of terms that can be used in a terms query. The default value
- is 65536. If `max_page_search_size` exceeds `index.max_terms_count` the
- {transform} will fail.
- Using smaller values for `max_page_search_size` may result in a longer duration
- for the {transform} checkpoint to complete.
- [discrete]
- [[transform-failed-limitations]]
- === Handling of failed {transforms}
- Failed {transforms} remain as a persistent task and should be handled
- appropriately, either by deleting it or by resolving the root cause of the
- failure and re-starting.
- When using the API to delete a failed {transform}, first stop it using
- `_stop?force=true`, then delete it.
- [discrete]
- [[transform-availability-limitations]]
- === {ctransforms-cap} may give incorrect results if documents are not yet available to search
- After a document is indexed, there is a very small delay until it is available
- to search.
- A {ctransform} periodically checks for changed entities between the time since
- it last checked and `now` minus `sync.time.delay`. This time window moves
- without overlapping. If the timestamp of a recently indexed document falls
- within this time window but this document is not yet available to search then
- this entity will not be updated.
- If using a `sync.time.field` that represents the data ingest time and using a
- zero second or very small `sync.time.delay`, then it is more likely that this
- issue will occur.
- [discrete]
- [[transform-date-nanos]]
- === Support for date nanoseconds data type
- If your data uses the <<date_nanos,date nanosecond data type>>, aggregations
- are nonetheless on millisecond resolution. This limitation also affects the
- aggregations in your {transforms}.
- [discrete]
- [[transform-data-streams-destination]]
- === Data streams as destination indices are not supported
- {transforms-cap} update data in the destination index which requires writing
- into the destination. <<data-streams>> are designed to be append-only, which
- means you cannot send update or delete requests directly to a data stream. For
- this reason, data streams are not supported as destination indices for
- {transforms}.
- [discrete]
- [[transform-ilm-destination]]
- === ILM as destination index may cause duplicated documents
- <<index-lifecycle-management,ILM>> is not recommended to use as a {transform}
- destination index. {transforms-cap} update documents in the current destination,
- and cannot delete documents in the indices previously used by ILM. This may lead
- to duplicated documents when you use {transforms} combined with ILM in case of a
- rollover.
- If you use ILM to have time-based indices, please consider using the
- <<date-index-name-processor>> instead. The processor works without duplicated
- documents if your {transform} contains a `group_by` based on `date_histogram`.
- [discrete]
- [[transform-ui-limitations]]
- == Limitations in {kib}
- [discrete]
- [[transform-space-limitations]]
- === {transforms-cap} are visible in all {kib} spaces
- {kibana-ref}/xpack-spaces.html[Spaces] enable you to organize your source and
- destination indices and other saved objects in {kib} and to see only the objects
- that belong to your space. However, a {transform} is a long running task which
- is managed on cluster level and therefore not limited in scope to certain
- spaces. Space awareness can be implemented for a {data-source} under
- **Stack Management > Kibana** which allows privileges to the {transform}
- destination index.
- [discrete]
- [[transform-kibana-limitations]]
- === Up to 1,000 {transforms} are listed in {kib}
- The {transforms} management page in {kib} lists up to 1000 {transforms}.
- [discrete]
- [[transform-ui-support]]
- === {kib} might not support every {transform} configuration option
- There might be configuration options available via the {transform} APIs that are
- not supported in {kib}. For an exhaustive list of configuration options, refer
- to the <<transform-apis,documentation>>.
|