| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778 | [[analysis-overview]]== Text analysis overview++++<titleabbrev>Overview</titleabbrev>++++Text analysis enables {es} to perform full-text search, where the search returnsall _relevant_ results rather than just exact matches.If you search for `Quick fox jumps`, you probably want the document thatcontains `A quick brown fox jumps over the lazy dog`, and you might also wantdocuments that contain related words like `fast fox` or `foxes leap`.[discrete][[tokenization]]=== TokenizationAnalysis makes full-text search possible through _tokenization_: breaking a textdown into smaller chunks, called _tokens_. In most cases, these tokens areindividual words.If you index the phrase `the quick brown fox jumps` as a single string and theuser searches for `quick fox`, it isn't considered a match. However, if youtokenize the phrase and index each word separately, the terms in the querystring can be looked up individually. This means they can be matched by searchesfor `quick fox`, `fox brown`, or other variations.[discrete][[normalization]]=== NormalizationTokenization enables matching on individual terms, but each token is stillmatched literally. This means:*  A search for `Quick` would not match `quick`, even though you likely wanteither term to match the other* Although `fox` and `foxes` share the same root word, a search for `foxes`would not match `fox` or vice versa.* A search for `jumps` would not match `leaps`. While they don't share a rootword, they are synonyms and have a similar meaning.To solve these problems, text analysis can _normalize_ these tokens into astandard format. This allows you to match tokens that are not exactly the sameas the search terms, but similar enough to still be relevant. For example:* `Quick` can be lowercased: `quick`.* `foxes` can be _stemmed_, or reduced to its root word: `fox`.* `jump` and `leap` are synonyms and can be indexed as a single word: `jump`.To ensure search terms match these words as intended, you can apply the sametokenization and normalization rules to the query string. For example, a searchfor `Foxes leap` can be normalized to a search for `fox jump`.[discrete][[analysis-customization]]=== Customize text analysisText analysis is performed by an <<analyzer-anatomy,_analyzer_>>, a set of rulesthat govern the entire process.{es} includes a default analyzer, called the<<analysis-standard-analyzer,standard analyzer>>, which works well for most usecases right out of the box.If you want to tailor your search experience, you can choose a different<<analysis-analyzers,built-in analyzer>> or even<<analysis-custom-analyzer,configure a custom one>>. A custom analyzer gives youcontrol over each step of the analysis process, including:* Changes to the text _before_ tokenization* How text is converted to tokens* Normalization changes made to tokens before indexing or search
 |