[[token-graphs]]
=== Token graphs

When a <<analyzer-anatomy-tokenizer,tokenizer>> converts text into a stream of
tokens, it also records the following:

* The `position` of each token in the stream
* The `positionLength`, the number of positions that a token spans

Using these, you can create a
https://en.wikipedia.org/wiki/Directed_acyclic_graph[directed acyclic graph],
called a _token graph_, for a stream. In a token graph, each position represents
a node. Each token represents an edge or arc, pointing to the next position.

image::images/analysis/token-graph-qbf-ex.svg[align="center"]

[[token-graphs-synonyms]]
==== Synonyms

Some <<analyzer-anatomy-token-filters,token filters>> can add new tokens, like
synonyms, to an existing token stream. These synonyms often span the same
positions as existing tokens.

In the following graph, `quick` and its synonym `fast` both have a position of
`0`. They span the same positions.

image::images/analysis/token-graph-qbf-synonym-ex.svg[align="center"]

[[token-graphs-multi-position-tokens]]
==== Multi-position tokens

Some token filters can add tokens that span multiple positions. These can
include tokens for multi-word synonyms, such as using "atm" as a synonym for
"automatic teller machine."

However, only some token filters, known as _graph token filters_, accurately
record the `positionLength` for multi-position tokens. These filters include:

* <<analysis-synonym-graph-tokenfilter,`synonym_graph`>>
* <<analysis-word-delimiter-graph-tokenfilter,`word_delimiter_graph`>>

In the following graph, `domain name system` and its synonym, `dns`, both have a
position of `0`. However, `dns` has a `positionLength` of `3`. Other tokens in
the graph have a default `positionLength` of `1`.

image::images/analysis/token-graph-dns-synonym-ex.svg[align="center"]

[[token-graphs-token-graphs-search]]
===== Using token graphs for search

<<analysis-index-search-time,Indexing>> ignores the `positionLength` attribute
and does not support token graphs containing multi-position tokens.

However, queries, such as the <<query-dsl-match-query,`match`>> or
<<query-dsl-match-query-phrase,`match_phrase`>> query, can use these graphs to
generate multiple sub-queries from a single query string.

.*Example*
[%collapsible]
====
A user runs a search for the following phrase using the `match_phrase` query:

`domain name system is fragile`

During <<analysis-index-search-time,search analysis>>, `dns`, a synonym for
`domain name system`, is added to the query string's token stream. The `dns`
token has a `positionLength` of `3`.

image::images/analysis/token-graph-dns-synonym-ex.svg[align="center"]

The `match_phrase` query uses this graph to generate sub-queries for the
following phrases:

[source,text]
------
dns is fragile
domain name system is fragile
------

This means the query matches documents containing either `dns is fragile` _or_
`domain name system is fragile`.
====

[[token-graphs-invalid-token-graphs]]
===== Invalid token graphs

The following token filters can add tokens that span multiple positions but
only record a default `positionLength` of `1`:

* <<analysis-synonym-tokenfilter,`synonym`>>
* <<analysis-word-delimiter-tokenfilter,`word_delimiter`>>

This means these filters will produce invalid token graphs for streams
containing such tokens.

In the following graph, `dns` is a multi-position synonym for `domain name
system`. However, `dns` has the default `positionLength` value of `1`, resulting
in an invalid graph.

image::images/analysis/token-graph-dns-invalid-ex.svg[align="center"]

Avoid using invalid token graphs for search. Invalid graphs can cause unexpected
search results.
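
To illustrate why `positionLength` matters, the following Python sketch models
a token stream as a graph and lists every path through it as a phrase. It is a
simplified illustration only, not the actual query-building logic: the `Token`
type and the `phrases` function are hypothetical names introduced here, and the
sketch assumes each token is an edge from `position` to
`position + positionLength`.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    """Hypothetical stand-in for one token in an analyzed stream."""
    term: str
    position: int
    position_length: int = 1  # default span of one position


def phrases(tokens):
    """List every path through the token graph, joined into a phrase."""
    # Group tokens (edges) by the position (node) they start from.
    edges = defaultdict(list)
    for tok in tokens:
        edges[tok.position].append(tok)
    # The final node is the furthest position reached by any token.
    end = max(t.position + t.position_length for t in tokens)

    def walk(pos):
        if pos == end:
            yield []
            return
        for tok in edges[pos]:
            for rest in walk(tok.position + tok.position_length):
                yield [tok.term] + rest

    return [" ".join(path) for path in walk(0)]


# Graph with an accurate positionLength of 3 for `dns`.
valid = [
    Token("dns", 0, 3),
    Token("domain", 0), Token("name", 1), Token("system", 2),
    Token("is", 3), Token("fragile", 4),
]
print(phrases(valid))
# ['dns is fragile', 'domain name system is fragile']

# Same stream, but `dns` keeps the default positionLength of 1.
invalid = [Token("dns", 0)] + valid[1:]
print(phrases(invalid))
# ['dns name system is fragile', 'domain name system is fragile']
```

In this model, the valid graph yields the two phrases from the `match_phrase`
example, while the invalid graph yields `dns name system is fragile`, a phrase
that the synonym rule never intended. Actual search behavior depends on the
query type and analyzer configuration, so treat the output as an aid to
intuition rather than a prediction of real results.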