1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859 |
- [[analysis-keyword-tokenizer]]
- === Keyword Tokenizer
- The `keyword` tokenizer is a ``noop'' tokenizer that accepts whatever text it
- is given and outputs the exact same text as a single term. It can be combined
- with token filters to normalise output, e.g. lower-casing email addresses.
- [float]
- === Example output
- [source,js]
- ---------------------------
- POST _analyze
- {
- "tokenizer": "keyword",
- "text": "New York"
- }
- ---------------------------
- // CONSOLE
- /////////////////////
- [source,console-result]
- ----------------------------
- {
- "tokens": [
- {
- "token": "New York",
- "start_offset": 0,
- "end_offset": 8,
- "type": "word",
- "position": 0
- }
- ]
- }
- ----------------------------
- /////////////////////
- The above sentence would produce the following term:
- [source,text]
- ---------------------------
- [ New York ]
- ---------------------------
- [float]
- === Configuration
- The `keyword` tokenizer accepts the following parameters:
- [horizontal]
- `buffer_size`::
- The number of characters read into the term buffer in a single pass.
- Defaults to `256`. The term buffer will grow by this size until all the
- text has been consumed. It is advisable not to change this setting.
|