
[DOCS] Note that `trim` filter doesn't change offsets (#53220)

The [word delimiter graph token filter docs][0] note that the `trim`
filter changes the length of tokens without changing their offsets.

This change states that behavior explicitly in the `trim` filter docs.

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/analysis-word-delimiter-graph-tokenfilter.html
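
The behavior described above can be illustrated with a small model. This is a minimal sketch, not Lucene's `TrimFilter` implementation: the `Token` class and `trim` function are hypothetical names introduced here, and Python's `str.strip()` only approximates Lucene's whitespace rules. It assumes a single token `" fox "` covering offsets 0 to 5 of the original text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for an analyzed token (not a Lucene class)."""
    term: str
    start_offset: int
    end_offset: int


def trim(token: Token) -> Token:
    """Strip leading/trailing whitespace from the term only.

    The offsets are deliberately left untouched, mirroring the
    documented behavior of the `trim` filter.
    """
    return replace(token, term=token.term.strip())


# Assumed input: the whole text " fox " emitted as one token.
original = Token(term=" fox ", start_offset=0, end_offset=5)
trimmed = trim(original)

print(trimmed.term, trimmed.start_offset, trimmed.end_offset)
```

After trimming, the term is three characters long while the offsets still span five characters of the original text, so the term length and the offset span no longer agree.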
James Rodewig 5 years ago
parent
commit
10f9a8fd64
1 changed files with 6 additions and 3 deletions

+ 6 - 3
docs/reference/analysis/tokenfilters/trim-tokenfilter.asciidoc

@@ -4,7 +4,9 @@
 <titleabbrev>Trim</titleabbrev>
 ++++
 
-Removes leading and trailing whitespace from each token in a stream.
+Removes leading and trailing whitespace from each token in a stream. While this
+can change the length of a token, the `trim` filter does _not_ change a token's
+offsets.
 
 The `trim` filter uses Lucene's
 https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/TrimFilter.html[TrimFilter].
@@ -37,8 +39,9 @@ GET _analyze
 }
 ----
 
-The API returns the following response. Note the `" fox "` token contains
-the original text's whitespace.
+The API returns the following response. Note the `" fox "` token contains the
+original text's whitespace. Note that despite changing the token's length, the
+`start_offset` and `end_offset` remain the same.
 
 [source,console-result]
 ----