[Docs] Clarify behaviour of Pattern Capture Token Filter during search (#26278)

There was some confusion about the fact that tokens emitted from a Pattern
Capture Token Filter are treated as synonyms when used to analyze a search
query. This commit adds an explanation to the note in the docs to emphasize this
behaviour.

Closes #25746
Christoph Büscher 8 years ago
commit 254c1b28e9

+ 6 - 4
docs/reference/analysis/tokenfilters/pattern-capture-tokenfilter.asciidoc

@@ -131,10 +131,12 @@ Multiple patterns are required to allow overlapping captures, but also
 means that patterns are less dense and easier to understand.
 
 *Note:* All tokens are emitted in the same position, and with the same
-character offsets, so when combined with highlighting, the whole
-original token will be highlighted, not just the matching subset. For
-instance, querying the above email address for `"smith"` would
-highlight:
+character offsets. This means, for example, that a `match` query for
+`john-smith_123@foo-bar.com` that uses this analyzer will return documents
+containing any of these tokens, even when using the `and` operator.
+Also, when combined with highlighting, the whole original token will 
+be highlighted, not just the matching subset. For instance, querying 
+the above email address for `"smith"` would highlight:
 
 [source,html]
 --------------------------------------------------