- --
- :api: term-vectors
- :request: TermVectorsRequest
- :response: TermVectorsResponse
- --
- [id="{upid}-{api}"]
- === Term Vectors API
- The Term Vectors API returns information and statistics on the terms in the fields
- of a particular document. The document can either be stored in the index or
- supplied by the user as an artificial document.
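To make the returned information concrete, here is a minimal, self-contained sketch of what a term vector records for one field: each term, how often it occurs, and the position and character offsets of every occurrence. It is an illustration only and does not use the Elasticsearch client. `TermVectorSketch`, `Token`, and `termVector` are hypothetical names, and the whitespace-and-lowercase tokenization is an assumption standing in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the Elasticsearch client.
final class TermVectorSketch {

    /** One occurrence of a term: its ordinal position and character offsets. */
    static final class Token {
        final int position;
        final int startOffset;
        final int endOffset; // exclusive

        Token(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Maps each lowercased term of one field value to its occurrences. */
    static Map<String, List<Token>> termVector(String text) {
        Map<String, List<Token>> terms = new TreeMap<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token (assumption: tokens are whitespace-delimited).
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                String term = text.substring(start, i).toLowerCase(Locale.ROOT);
                terms.computeIfAbsent(term, k -> new ArrayList<>())
                     .add(new Token(position++, start, i));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Map<String, List<Token>> vector = termVector("the quick fox saw the dog");
        for (Map.Entry<String, List<Token>> entry : vector.entrySet()) {
            // The term frequency is the number of recorded occurrences.
            System.out.println(entry.getKey() + " termFreq=" + entry.getValue().size());
        }
    }
}
```

For the input `the quick fox saw the dog`, the term `the` is recorded twice, at positions 0 and 4, which is the kind of per-token detail the response exposes when positions and offsets are requested.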
- [id="{upid}-{api}-request"]
- ==== Term Vectors Request
- A +{request}+ expects an `index`, a `type`, and an `id` to identify
- a document, along with the fields for which the information is retrieved.
- ["source","java",subs="attributes,callouts,macros"]
- --------------------------------------------------
- include-tagged::{doc-tests-file}[{api}-request]
- --------------------------------------------------
- Term vectors can also be generated for artificial documents, that is, for
- documents not present in the index:
- ["source","java",subs="attributes,callouts,macros"]
- --------------------------------------------------
- include-tagged::{doc-tests-file}[{api}-request-artificial]
- --------------------------------------------------
- <1> An artificial document is provided as an `XContentBuilder` object,
- the Elasticsearch built-in helper to generate JSON content.
- ===== Optional arguments
- ["source","java",subs="attributes,callouts,macros"]
- --------------------------------------------------
- include-tagged::{doc-tests-file}[{api}-request-optional-arguments]
- --------------------------------------------------
- <1> Set `fieldStatistics` to `false` (default is `true`) to omit the document count,
- the sum of document frequencies, and the sum of total term frequencies.
- <2> Set `termStatistics` to `true` (default is `false`) to display
- total term frequency and document frequency.
- <3> Set `positions` to `false` (default is `true`) to omit the output of
- positions.
- <4> Set `offsets` to `false` (default is `true`) to omit the output of
- offsets.
- <5> Set `payloads` to `false` (default is `true`) to omit the output of
- payloads.
- <6> Set `filterSettings` to filter the terms that can be returned based
- on their tf-idf scores.
- <7> Set `perFieldAnalyzer` to specify a different analyzer than
- the one that the field has.
- <8> Set `realtime` to `false` (default is `true`) to retrieve term vectors
- in near real time rather than in real time.
- <9> Set a routing parameter.
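The idea behind filtering terms by their tf-idf scores can be sketched in plain Java. The example below is an assumption-laden illustration, not the scoring that Elasticsearch applies: it uses a textbook tf-idf formula, and `TfIdfFilterSketch`, `TermStat`, `score`, and `topTerms` are hypothetical names introduced only for this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of the Elasticsearch client.
final class TfIdfFilterSketch {

    /** Per-term inputs: frequency in the document and number of documents containing it. */
    static final class TermStat {
        final String term;
        final int termFreq;
        final int docFreq;

        TermStat(String term, int termFreq, int docFreq) {
            this.term = term;
            this.termFreq = termFreq;
            this.docFreq = docFreq;
        }
    }

    /** Textbook tf-idf (an assumption): termFreq * ln(1 + docCount / docFreq). */
    static double score(TermStat stat, int docCount) {
        return stat.termFreq * Math.log(1.0 + (double) docCount / stat.docFreq);
    }

    /** Returns at most maxNumTerms term names, highest score first. */
    static List<String> topTerms(List<TermStat> stats, int docCount, int maxNumTerms) {
        List<TermStat> sorted = new ArrayList<>(stats);
        sorted.sort(Comparator.comparingDouble((TermStat s) -> score(s, docCount)).reversed());
        List<String> result = new ArrayList<>();
        for (TermStat stat : sorted.subList(0, Math.min(maxNumTerms, sorted.size()))) {
            result.add(stat.term);
        }
        return result;
    }
}
```

With this formula, a term that is frequent in the document but present in nearly every document of the index scores lower than a rarer term, so a cap on the number of returned terms tends to keep the more distinctive ones.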
- include::../execution.asciidoc[]
- [id="{upid}-{api}-response"]
- ==== TermVectorsResponse
- The `TermVectorsResponse` contains the following information:
- ["source","java",subs="attributes,callouts,macros"]
- --------------------------------------------------
- include-tagged::{doc-tests-file}[{api}-response]
- --------------------------------------------------
- <1> The index name of the document.
- <2> The type name of the document.
- <3> The id of the document.
- <4> Indicates whether or not the document was found.
- ===== Inspecting Term Vectors
- If the `TermVectorsResponse` contains a non-null list of term vectors,
- more information about each of them can be obtained as follows:
- ["source","java",subs="attributes,callouts,macros"]
- --------------------------------------------------
- include-tagged::{doc-tests-file}[{api}-term-vectors]
- --------------------------------------------------
- <1> The list of `TermVector` for the document
- <2> The name of the current field
- <3> Field statistics for the current field - document count
- <4> Field statistics for the current field - sum of total term frequencies
- <5> Field statistics for the current field - sum of document frequencies
- <6> Terms for the current field
- <7> The name of the term
- <8> Term frequency of the term
- <9> Document frequency of the term
- <10> Total term frequency of the term
- <11> Score of the term
- <12> Tokens of the term
- <13> Position of the token
- <14> Start offset of the token
- <15> End offset of the token
- <16> Payload of the token
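The field-level and term-level statistics listed above can be illustrated with a small in-memory computation. This is a sketch under stated assumptions rather than the way Elasticsearch computes them: the "index" is a plain list of field values, tokens are split on whitespace, and `FieldStatsSketch` with its members is a hypothetical name used only here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the Elasticsearch client.
final class FieldStatsSketch {
    final int docCount;           // number of documents that have the field
    final long sumDocFreq;        // sum of document frequencies over all terms
    final long sumTotalTermFreq;  // total number of tokens in the field
    final Map<String, Integer> docFreq = new HashMap<>();     // documents containing the term
    final Map<String, Long> totalTermFreq = new HashMap<>();  // occurrences across documents

    FieldStatsSketch(List<String> fieldValues) {
        long tokenCount = 0;
        for (String value : fieldValues) {
            List<String> terms = tokenize(value);
            for (String term : terms) {
                totalTermFreq.merge(term, 1L, Long::sum);
                tokenCount++;
            }
            // Count each distinct term once per document.
            for (String term : new HashSet<>(terms)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        long docFreqSum = 0;
        for (int frequency : docFreq.values()) {
            docFreqSum += frequency;
        }
        this.docCount = fieldValues.size();
        this.sumDocFreq = docFreqSum;
        this.sumTotalTermFreq = tokenCount;
    }

    /** Splits on whitespace and drops empty tokens (assumed analyzer). */
    private static List<String> tokenize(String value) {
        List<String> terms = new ArrayList<>();
        for (String part : value.split("\\s+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }
}
```

For the three field values `a b a`, `b c`, and `a`, the document count is 3, the sum of total term frequencies is 6 (six tokens overall), and the sum of document frequencies is 5 (`a` and `b` each appear in two documents, `c` in one).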
- [id="{upid}-{api}-response"]
- ==== TermVectorsResponse
- The `TermVectorsResponse` contains the following information:
- ["source","java",subs="attributes,callouts,macros"]
- --------------------------------------------------
- include-tagged::{doc-tests-file}[{api}-response]
- --------------------------------------------------
- <1> The index name of the document.
- <2> The type name of the document.
- <3> The id of the document.
- <4> Indicates whether or not the document was found.
- <5> Indicates whether or not there are term vectors for this document.
- <6> The list of `TermVector` for the document
- <7> The name of the current field
- <8> Field statistics for the current field - document count
- <9> Field statistics for the current field - sum of total term frequencies
- <10> Field statistics for the current field - sum of document frequencies
- <11> Terms for the current field
- <12> The name of the term
- <13> Term frequency of the term
- <14> Document frequency of the term
- <15> Total term frequency of the term
- <16> Score of the term
- <17> Tokens of the term
- <18> Position of the token
- <19> Start offset of the token
- <20> End offset of the token
- <21> Payload of the token