[[analysis-smartcn]]
=== Smart Chinese Analysis Plugin

The Smart Chinese Analysis plugin integrates Lucene's Smart Chinese analysis
module into Elasticsearch.

It provides an analyzer for Chinese or mixed Chinese-English text. This
analyzer uses probabilistic knowledge to find the optimal word segmentation
for Simplified Chinese text. The text is first broken into sentences, then
each sentence is segmented into words.

:plugin_name: analysis-smartcn
include::install_remove.asciidoc[]

[[analysis-smartcn-tokenizer]]
[float]
==== `smartcn` tokenizer and token filter

The plugin provides the `smartcn` analyzer and the `smartcn_tokenizer` tokenizer,
neither of which is configurable.

NOTE: The `smartcn_word` token filter and the `smartcn_sentence` tokenizer have been deprecated.
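
As a minimal sketch of how the two components might be referenced, the Python
helpers below build JSON request bodies only: one for the `_analyze` API using
the `smartcn` analyzer, and one for index settings that wrap
`smartcn_tokenizer` in a custom analyzer. The helper names, the analyzer name
`my_smartcn`, and the added built-in `lowercase` filter are assumptions made
for illustration and are not part of the plugin.

```python
import json


def smartcn_analyze_body(text):
    """Build a request body for the _analyze API using the `smartcn` analyzer."""
    return {"analyzer": "smartcn", "text": text}


def smartcn_index_settings(analyzer_name="my_smartcn"):
    """Build index settings with a custom analyzer around `smartcn_tokenizer`.

    `analyzer_name` is a hypothetical name; the `lowercase` filter is an
    illustrative addition, not something the plugin requires.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "smartcn_tokenizer",
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the bodies; sending them to a cluster is left to the reader.
    print(json.dumps(smartcn_analyze_body("我喜欢 Elasticsearch"), ensure_ascii=False))
    print(json.dumps(smartcn_index_settings(), indent=2))
```

The first body could be sent to the `_analyze` endpoint and the second used
when creating an index, assuming the plugin is installed on every node.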