- [[analysis-phonetic]]
- === Phonetic Analysis Plugin
- The Phonetic Analysis plugin provides token filters that convert tokens to
- their phonetic representation using Soundex, Metaphone, and a variety of other
- algorithms.
- :plugin_name: analysis-phonetic
- include::install_remove.asciidoc[]
- [[analysis-phonetic-token-filter]]
- ==== `phonetic` token filter
- The `phonetic` token filter takes the following settings:
- `encoder`::
- Which phonetic encoder to use. Accepts `metaphone` (default),
- `double_metaphone`, `soundex`, `refined_soundex`, `caverphone1`,
- `caverphone2`, `cologne`, `nysiis`, `koelnerphonetik`, `haasephonetik`,
- `beider_morse`, `daitch_mokotoff`.
- `replace`::
- Whether the phonetic token replaces the original token. Accepts `true`
- (default) and `false`; when `false`, both the original token and the
- phonetic token are emitted. Not supported by the `beider_morse` encoder.
- [source,js]
- --------------------------------------------------
- PUT phonetic_sample
- {
-   "settings": {
-     "index": {
-       "analysis": {
-         "analyzer": {
-           "my_analyzer": {
-             "tokenizer": "standard",
-             "filter": [
-               "lowercase",
-               "my_metaphone"
-             ]
-           }
-         },
-         "filter": {
-           "my_metaphone": {
-             "type": "phonetic",
-             "encoder": "metaphone",
-             "replace": false
-           }
-         }
-       }
-     }
-   }
- }
- GET phonetic_sample/_analyze
- {
-   "analyzer": "my_analyzer",
-   "text": "Joe Bloggs" <1>
- }
- --------------------------------------------------
- // CONSOLE
- <1> Returns: `J`, `joe`, `BLKS`, `bloggs`
- [float]
- ===== Double metaphone settings
- If the `double_metaphone` encoder is used, then this additional setting is
- supported:
- `max_code_len`::
- The maximum length of the emitted metaphone token. Defaults to `4`.
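- As a minimal sketch, a filter that uses this setting might be defined as
- follows. The index name `double_metaphone_sample`, the analyzer and filter
- names, and the value `6` are illustrative assumptions, not recommendations:
- [source,js]
- --------------------------------------------------
- PUT double_metaphone_sample
- {
-   "settings": {
-     "index": {
-       "analysis": {
-         "analyzer": {
-           "my_analyzer": {
-             "tokenizer": "standard",
-             "filter": [
-               "lowercase",
-               "my_double_metaphone"
-             ]
-           }
-         },
-         "filter": {
-           "my_double_metaphone": {
-             "type": "phonetic",
-             "encoder": "double_metaphone",
-             "max_code_len": 6
-           }
-         }
-       }
-     }
-   }
- }
- --------------------------------------------------
- // CONSOLE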
- [float]
- ===== Beider Morse settings
- If the `beider_morse` encoder is used, then these additional settings are
- supported:
- `rule_type`::
- Whether matching should be `exact` or `approx` (default).
- `name_type`::
- Whether names are `ashkenazi`, `sephardic`, or `generic` (default).
- `languageset`::
- An array of languages to check. If not specified, then the language will
- be guessed. Accepts: `any`, `common`, `cyrillic`, `english`, `french`,
- `german`, `hebrew`, `hungarian`, `polish`, `romanian`, `russian`,
- `spanish`.
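- As a minimal sketch, the settings above could be combined as follows. The
- index name `beider_morse_sample`, the analyzer and filter names, and the
- chosen values are illustrative assumptions. The `replace` setting is left out
- because it is not supported by the `beider_morse` encoder:
- [source,js]
- --------------------------------------------------
- PUT beider_morse_sample
- {
-   "settings": {
-     "index": {
-       "analysis": {
-         "analyzer": {
-           "my_analyzer": {
-             "tokenizer": "standard",
-             "filter": [
-               "lowercase",
-               "my_beider_morse"
-             ]
-           }
-         },
-         "filter": {
-           "my_beider_morse": {
-             "type": "phonetic",
-             "encoder": "beider_morse",
-             "rule_type": "exact",
-             "name_type": "generic",
-             "languageset": ["english", "german"]
-           }
-         }
-       }
-     }
-   }
- }
- --------------------------------------------------
- // CONSOLE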