[[analysis-chargroup-tokenizer]]
=== Char Group Tokenizer

The `char_group` tokenizer breaks text into terms whenever it encounters a
character that is in a defined set. It is mostly useful for cases where simple
custom tokenization is desired and the overhead of the
<<analysis-pattern-tokenizer, `pattern` tokenizer>> is not acceptable.

[float]
=== Configuration

The `char_group` tokenizer accepts one parameter:

[horizontal]
`tokenize_on_chars`::
    A list of characters to tokenize the string on. Whenever a character
    from this list is encountered, a new token is started. This accepts either single
    characters, e.g. `-`, or character groups: `whitespace`, `letter`, `digit`,
    `punctuation`, `symbol`.

[float]
=== Example output

[source,console]
---------------------------
POST _analyze
{
  "tokenizer": {
    "type": "char_group",
    "tokenize_on_chars": [
      "whitespace",
      "-",
      "\n"
    ]
  },
  "text": "The QUICK brown-fox"
}
---------------------------

returns

[source,console-result]
---------------------------
{
  "tokens": [
    {
      "token": "The",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 0
    },
    {
      "token": "QUICK",
      "start_offset": 4,
      "end_offset": 9,
      "type": "word",
      "position": 1
    },
    {
      "token": "brown",
      "start_offset": 10,
      "end_offset": 15,
      "type": "word",
      "position": 2
    },
    {
      "token": "fox",
      "start_offset": 16,
      "end_offset": 19,
      "type": "word",
      "position": 3
    }
  ]
}
---------------------------
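
The tokenizer can also be referenced from a custom analyzer in the index
settings. The following is a minimal sketch of such a configuration; the index
name `my-index` and the names `my_analyzer` and `my_tokenizer` are placeholder
names chosen for this illustration, not part of the tokenizer itself.

[source,console]
---------------------------
PUT my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "char_group",
          "tokenize_on_chars": [
            "whitespace",
            "-",
            "\n"
          ]
        }
      }
    }
  }
}
---------------------------

Text analyzed with `my_analyzer` would then be split on whitespace, hyphens,
and newlines, matching the `_analyze` example above.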