| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181 | [role="xpack"][testenv="basic"][[sql-limitations]]== SQL Limitations[float][[large-parsing-trees]]=== Large queries may throw `ParsingExpection`Extremely large queries can consume too much memory during the parsing phase, in which case the {es-sql} engine willabort parsing and throw an error. In such cases, consider reducing the query to a smaller size by potentiallysimplifying it or splitting it into smaller queries.[float][[sys-columns-describe-table-nested-fields]]=== Nested fields in `SYS COLUMNS` and `DESCRIBE TABLE`{es} has a special type of relationship fields called `nested` fields. In {es-sql} they can be used by referencing their innersub-fields. Even though `SYS COLUMNS` in non-driver mode (in the CLI and in REST calls) and `DESCRIBE TABLE` will still displaythem as having the type `NESTED`, they cannot be used in a query. One can only reference its sub-fields in the form:[source, sql]--------------------------------------------------[nested_field_name].[sub_field_name]--------------------------------------------------For example:[source, sql]--------------------------------------------------SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages;--------------------------------------------------[float]=== Multi-nested fields{es-sql} doesn't support multi-nested documents, so a query cannot reference more than one nested field in an index.This applies to multi-level nested fields, but also multiple nested fields defined on the same level. For example, for this index:[source, sql]----------------------------------------------------       column         |     type      |    mapping----------------------+---------------+-------------nested_A              |STRUCT         |NESTEDnested_A.nested_X     |STRUCT         |NESTEDnested_A.nested_X.text|VARCHAR        |KEYWORDnested_A.text         |VARCHAR        |KEYWORDnested_B              |STRUCT         |NESTEDnested_B.text         |VARCHAR        |KEYWORD----------------------------------------------------`nested_A` and `nested_B` cannot be used at the same time, nor `nested_A`/`nested_B` and `nested_A.nested_X` combination.For such situations, {es-sql} will display an error message.[float]=== Paginating nested inner hitsWhen SELECTing a nested field, pagination will not work as expected, {es-sql} will return __at least__ the page size records. This is because of the way nested queries work in {es}: the root nested field will be returned and it's matching inner nested fields as well,pagination taking place on the **root nested document and not on its inner hits**.[float][[normalized-keyword-fields]]=== Normalized `keyword` fields`keyword` fields in {es} can be normalized by defining a `normalizer`. Such fields are not supported in {es-sql}.[float]=== Array type of fieldsArray fields are not supported due to the "invisible" way in which {es} handles an array of values: the mapping doesn't indicate whethera field is an array (has multiple values) or not, so without reading all the data, {es-sql} cannot know whether a field is a single or multi value.When multiple values are returned for a field, by default, {es-sql} will throw an exception. However, it is possible to change this behavior through `field_multi_value_leniency` parameter in REST (disabled by default) or`field.multi.value.leniency` in drivers (enabled by default).[float]=== Sorting by aggregationWhen doing aggregations (`GROUP BY`) {es-sql} relies on {es}'s `composite` aggregation for its support for paginating results.However this type of aggregation does come with a limitation: sorting can only be applied on the key used for the aggregation's buckets. {es-sql} overcomes this limitation by doing client-side sorting however as a safety measure, allows only up to *512* rows.It is recommended to use `LIMIT` for queries that use sorting by aggregation, essentially indicating the top N results that are desired:[source, sql]--------------------------------------------------SELECT * FROM test GROUP BY age ORDER BY COUNT(*) LIMIT 100;--------------------------------------------------It is possible to run the same queries without a `LIMIT` however in that case if the maximum size (*10000*) is passed,an exception will be returned as {es-sql} is unable to track (and sort) all the results returned.[float]=== Using aggregation functions on top of scalar functionsAggregation functions like <<sql-functions-aggs-min,`MIN`>>, <<sql-functions-aggs-max,`MAX`>>, etc. can only be useddirectly on fields, and so queries like `SELECT MAX(abs(age)) FROM test` are not possible.[float]=== Using a sub-selectUsing sub-selects (`SELECT X FROM (SELECT Y)`) is **supported to a small degree**: any sub-select that can be "flattened" into a single`SELECT` is possible with {es-sql}. For example:["source","sql",subs="attributes,macros"]--------------------------------------------------include-tagged::{sql-specs}/docs/docs.csv-spec[limitationSubSelect]--------------------------------------------------The query above is possible because it is equivalent with:["source","sql",subs="attributes,macros"]--------------------------------------------------include-tagged::{sql-specs}/docs/docs.csv-spec[limitationSubSelectRewritten]--------------------------------------------------But, if the sub-select would include a `GROUP BY` or `HAVING` or the enclosing `SELECT` would be more complex than `SELECT XFROM (SELECT ...) WHERE [simple_condition]`, this is currently **un-supported**.[float][[first-last-agg-functions-having-clause]]=== Using <<sql-functions-aggs-first, `FIRST`>>/<<sql-functions-aggs-last,`LAST`>> aggregation functions in `HAVING` clauseUsing `FIRST` and `LAST` in the `HAVING` clause is not supported. The same applies to<<sql-functions-aggs-min,`MIN`>> and <<sql-functions-aggs-max,`MAX`>> when their target columnis of type <<keyword, `keyword`>> as they are internally translated to `FIRST` and `LAST`.[float][[group-by-time]]=== Using TIME data type in GROUP BY or <<sql-functions-grouping-histogram>>Using `TIME` data type as a grouping key is currently not supported. For example:[source, sql]-------------------------------------------------------------SELECT count(*) FROM test GROUP BY CAST(date_created AS TIME);-------------------------------------------------------------On the other hand, it can still be used if it's wrapped with a scalar function that returns another data type,for example:[source, sql]-------------------------------------------------------------SELECT count(*) FROM test GROUP BY MINUTE((CAST(date_created AS TIME));-------------------------------------------------------------`TIME` data type is also currently not supported in histogram grouping function. For example:[source, sql]-------------------------------------------------------------SELECT HISTOGRAM(CAST(birth_date AS TIME), INTERVAL '10' MINUTES) as h, COUNT(*) FROM t GROUP BY h-------------------------------------------------------------[float][[geo-sql-limitations]]=== Geo-related functionsSince `geo_shape` fields don't have doc values these fields cannot be used for filtering, grouping or sorting.By default,`geo_points` fields are indexed and have doc values. However only latitude and longitude are stored andindexed with some loss of precision from the original values (4.190951585769653E-8 for the latitude and8.381903171539307E-8 for longitude). The altitude component is accepted but not stored in doc values nor indexed.Therefore calling `ST_Z` function in the filtering, grouping or sorting will return `null`.[float][[fields-from-source]]=== Retrieving from `_source`Most of {es-sql}'s columns are retrieved from the document's `_source` and there is no attempt to get the columns content from`docvalue_fields` not even in the case <<mapping-source-field,`_source`>> field is disabled in the mapping explicitly.If a column, for which there is no source stored, is asked for in a query, {es-sql} will not return it. Field types that don't followthis restriction are: `keyword`, `date`, `scaled_float`, `geo_point`, `geo_shape` since they are NOT returned from `_source` butfrom `docvalue_fields`.[float][[fields-from-docvalues]]=== Retrieving from `docvalue_fields`When the number of columns retrieveable from `docvalue_fields` is greater than the configured <<dynamic-index-settings,`index.max_docvalue_fields_search` setting>>the query will fail with `IllegalArgumentException: Trying to retrieve too many docvalue_fields` error. Either the mentioned {es}setting needs to be adjusted or fewer columns retrieveable from `docvalue_fields` need to be selected.
 |