[[analysis-index-search-time]]
=== Index and search analysis

Text analysis occurs at two times:

Index time::
When a document is indexed, any <<text,`text`>> field values are analyzed.

Search time::
When running a <<full-text-queries,full-text search>> on a `text` field,
the query string (the text the user is searching for) is analyzed.
+
Search time is also called _query time_.

The analyzer, or set of analysis rules, used at each time is called the
_index analyzer_ or _search analyzer_, respectively.

[[analysis-same-index-search-analyzer]]
==== How the index and search analyzer work together

In most cases, the same analyzer should be used at index and search time. This
ensures the values and query strings for a field are changed into the same form
of tokens. In turn, this ensures the tokens match as expected during a search.

.*Example*
[%collapsible]
====
A document is indexed with the following value in a `text` field:

[source,text]
------
The QUICK brown foxes jumped over the dog!
------

The index analyzer for the field converts the value into tokens and normalizes
them. In this case, each of the tokens represents a word:

[source,text]
------
[ quick, brown, fox, jump, over, dog ]
------

These tokens are then indexed.

Later, a user searches the same `text` field for:

[source,text]
------
"Quick fox"
------

The user expects this search to match the sentence indexed earlier,
`The QUICK brown foxes jumped over the dog!`.

However, the query string does not contain the exact words used in the
document's original text:

* `quick` vs `QUICK`
* `fox` vs `foxes`

To account for this, the query string is analyzed using the same analyzer. This
analyzer produces the following tokens:

[source,text]
------
[ quick, fox ]
------

To execute the search, {es} compares these query string tokens to the tokens
indexed in the `text` field.

[options="header"]
|===
|Token     | Query string | `text` field

|`quick`   | X            | X
|`brown`   |              | X
|`fox`     | X            | X
|`jump`    |              | X
|`over`    |              | X
|`dog`     |              | X
|===

Because the field value and query string were analyzed in the same way, they
created similar tokens. The tokens `quick` and `fox` are exact matches. This
means the search matches the document containing `"The QUICK brown foxes jumped
over the dog!"`, just as the user expects.
====
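
To try this out yourself, you can pass sample text through the `_analyze` API.
The example above does not name a specific analyzer, so the request below
assumes a stemming analyzer such as the built-in `english` analyzer, which
produces tokens like the ones shown; the exact output depends on the analyzer
configured for your field.

[source,console]
------
GET /_analyze
{
  "analyzer": "english",
  "text": "The QUICK brown foxes jumped over the dog!"
}
------

The response lists the resulting tokens, which you can compare with the tokens
produced for a query string such as `"Quick fox"`.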

[[different-analyzers]]
==== When to use a different search analyzer

While less common, it sometimes makes sense to use different analyzers at index
and search time. To enable this, {es} allows you to
<<specify-search-analyzer,specify a separate search analyzer>>.

Generally, a separate search analyzer should only be specified when using the
same form of tokens for field values and query strings would create unexpected
or irrelevant search matches.

[[different-analyzer-ex]]
.*Example*
[%collapsible]
====
{es} is used to create a search engine that matches only words that start with
a provided prefix. For instance, a search for `tr` should return `tram` or
`trope`, but never `taxi` or `bat`.

A document is added to the search engine's index; this document contains one
such word in a `text` field:

[source,text]
------
"Apple"
------

The index analyzer for the field converts the value into tokens and normalizes
them. In this case, each of the tokens represents a potential prefix for
the word:

[source,text]
------
[ a, ap, app, appl, apple ]
------

These tokens are then indexed.

Later, a user searches the same `text` field for:

[source,text]
------
"appli"
------

The user expects this search to match only words that start with `appli`,
such as `appliance` or `application`. The search should not match `apple`.

However, if the index analyzer is used to analyze this query string, it would
produce the following tokens:

[source,text]
------
[ a, ap, app, appl, appli ]
------

When {es} compares these query string tokens to the ones indexed for `apple`,
it finds several matches.

[options="header"]
|===
|Token      | `appli` query string | `apple` `text` field

|`a`        | X                    | X
|`ap`       | X                    | X
|`app`      | X                    | X
|`appl`     | X                    | X
|`appli`    | X                    |
|===

This means the search would erroneously match `apple`. Not only that, it would
match any word starting with `a`.

To fix this, you can specify a different search analyzer for query strings used
on the `text` field.

In this case, you could specify a search analyzer that produces a single token
rather than a set of prefixes:

[source,text]
------
[ appli ]
------

This query string token would only match tokens for words that start with
`appli`, which better aligns with the user's search expectations.
====
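
One way to implement the prefix example above is to index with a custom
analyzer built around an `edge_ngram` token filter and to override it at search
time with the `search_analyzer` mapping parameter. The following request is
only a sketch: the index, field, analyzer, and filter names, the `max_gram`
size, and the choice of `standard` as the search analyzer are assumptions for
illustration, not part of the example above.

[source,console]
------
PUT my-index
{
  "settings": {
    "analysis": {
      "filter": {
        "prefixes": {
          "type": "edge_ngram", <1>
          "min_gram": 1,
          "max_gram": 10
        }
      },
      "analyzer": {
        "prefix_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "lowercase", "prefixes" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text": {
        "type": "text",
        "analyzer": "prefix_analyzer", <2>
        "search_analyzer": "standard" <3>
      }
    }
  }
}
------
<1> Token filter that breaks each word into edge n-grams, so `Apple` is indexed
as `[ a, ap, app, appl, apple ]`.
<2> Index analyzer that produces the prefix tokens.
<3> Search analyzer that analyzes a query string such as `appli` into the
single token `[ appli ]`.

With a mapping along these lines, the query string `appli` matches words such
as `appliance` or `application` but not `apple`.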