- [[analysis]]
- = Text analysis
- :lucene-analysis-docs: https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis
- :lucene-gh-main-link: https://github.com/apache/lucene/blob/main/lucene
- :lucene-stop-word-link: {lucene-gh-main-link}/analysis/common/src/resources/org/apache/lucene/analysis
- [partintro]
- --
- _Text analysis_ is the process of converting unstructured text, like
- the body of an email or a product description, into a structured format that's
- optimized for search.
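As a rough illustration of what "structured format" can mean here, the sketch below turns free text into lowercase tokens and collects them into a small lookup table from token to document ID. It is a simplified toy, not how {es} or Lucene implement analysis. The `analyze` and `build_inverted_index` functions, the `STOP_WORDS` list, and the sample documents are all hypothetical names chosen for this example.

```python
import re

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "of", "is"}


def analyze(text):
    """Split text into lowercase word tokens and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(docs):
    """Map each token to the sorted list of document IDs that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


# Sample documents (assumed data, not from the reference docs).
docs = {
    1: "The body of an email",
    2: "A product description",
}
index = build_inverted_index(docs)

print(analyze("The Body of an EMAIL"))
print(index.get("email"))
```

Because the same `analyze` step runs on both the stored text and the query text, a search for `EMAIL` and a document containing `email` end up with the same token, which is the basic reason analysis settings affect which searches match.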
- [discrete]
- [[when-to-configure-analysis]]
- === When to configure text analysis
- {es} performs text analysis when indexing or searching <<text,`text`>> fields.
- If your index doesn't contain `text` fields, no further setup is needed; you can
- skip the pages in this section.
- However, if you use `text` fields or your text searches aren't returning results
- as expected, configuring text analysis can often help. You should also look into
- analysis configuration if you're using {es} to:
- * Build a search engine
- * Mine unstructured data
- * Fine-tune search for a specific language
- * Perform lexicographic or linguistic research
- [discrete]
- [[analysis-toc]]
- === In this section
- * <<analysis-overview>>
- * <<analysis-concepts>>
- * <<configure-text-analysis>>
- * <<analysis-analyzers>>
- * <<analysis-tokenizers>>
- * <<analysis-tokenfilters>>
- * <<analysis-charfilters>>
- * <<analysis-normalizers>>
- --
- include::analysis/overview.asciidoc[]
- include::analysis/concepts.asciidoc[]
- include::analysis/configure-text-analysis.asciidoc[]
- include::analysis/analyzers.asciidoc[]
- include::analysis/tokenizers.asciidoc[]
- include::analysis/tokenfilters.asciidoc[]
- include::analysis/charfilters.asciidoc[]
- include::analysis/normalizers.asciidoc[]