[Docs] Clarify caveats for phonetic filters replace option (#42807)

The `replace` option in the phonetic token filter can have surprising side
effects, such as those described in #26921. This PR adds a note to be mindful
of such scenarios and offers alternatives to using the `replace` option.

Closes #26921
Christoph Büscher, 6 years ago
commit ffc5534584

+ 8 - 0
docs/plugins/analysis-phonetic.asciidoc

@@ -65,6 +65,14 @@ GET phonetic_sample/_analyze
 
 <1> Returns: `J`, `joe`, `BLKS`, `bloggs`
 
+It is important to note that `"replace": false` can lead to unexpected behavior since
+the original and the phonetically analyzed version are both kept at the same token position.
+Some queries handle these stacked tokens in special ways. For example, the fuzzy `match`
+query does not apply {ref}/common-options.html#fuzziness[fuzziness] to stacked synonym tokens.
+This can lead to issues that are difficult to diagnose and reason about. For this reason, it
+is often beneficial to use separate fields for analysis with and without phonetic filtering.
+That way searches can be run against both fields with differing boosts and trade-offs (e.g.
+only run a fuzzy `match` query on the original text field, but not on the phonetic version).
 
 [float]
 ===== Double metaphone settings

+ 2 - 1
docs/reference/query-dsl/match-query.asciidoc

@@ -75,7 +75,8 @@ rewritten.
 Fuzzy transpositions (`ab` -> `ba`) are allowed by default but can be disabled
 by setting `fuzzy_transpositions` to `false`.
 
-Note that fuzzy matching is not applied to terms with synonyms, as under the hood
+NOTE: Fuzzy matching is not applied to terms with synonyms or in cases where the
+analysis process produces multiple tokens at the same position. Under the hood
 these terms are expanded to a special synonym query that blends term frequencies,
 which does not support fuzzy expansion.
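
The separate-fields alternative recommended in the phonetic filter note can be sketched as a pair of request bodies, built here as plain Python dictionaries. This is a minimal illustration, not part of the commit: the field `name`, its sub-field `name.phonetic`, the filter `phonetic_only`, and the analyzer `name_phonetic` are hypothetical names, and the mapping layout assumes a typeless (7.x-style) index. The phonetic sub-field uses `"replace": true`, so each position holds a single token, and fuzziness is applied only to the original text field.

```python
import json

# Hypothetical index definition: all field, filter, and analyzer names
# below are invented for this sketch.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "phonetic_only": {
                    "type": "phonetic",
                    "encoder": "metaphone",
                    # Keep only the phonetic token, so no tokens are stacked.
                    "replace": True,
                }
            },
            "analyzer": {
                "name_phonetic": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "phonetic_only"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",  # original text, default analysis
                "fields": {
                    # Same source text, indexed a second time phonetically.
                    "phonetic": {"type": "text", "analyzer": "name_phonetic"}
                },
            }
        }
    },
}


def build_query(text):
    """Search both fields; fuzziness only on the non-phonetic field."""
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "match": {
                            "name": {
                                "query": text,
                                "fuzziness": "AUTO",
                                "boost": 2.0,  # arbitrary boost for the sketch
                            }
                        }
                    },
                    {"match": {"name.phonetic": {"query": text}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
    print(json.dumps(build_query("Joe Bloggs"), indent=2))
```

The printed JSON could be sent as the bodies of an index-creation request and a `_search` request, respectively. The boost value is a placeholder; the appropriate weighting between exact and phonetic matches depends on the data.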