
--
:api: term-vectors
:request: TermVectorsRequest
:response: TermVectorsResponse
--

[id="{upid}-{api}"]
=== Term Vectors API

The Term Vectors API returns information and statistics on the terms in the fields
of a particular document. The document can either be stored in the index or
be supplied by the user as an artificial document.

[id="{upid}-{api}-request"]
==== Term Vectors Request

A +{request}+ expects an `index` and an `id` to identify
a document, and the fields for which the information is retrieved.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------

Term vectors can also be generated for artificial documents, that is, for
documents not present in the index:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-artificial]
--------------------------------------------------
<1> An artificial document is provided as an `XContentBuilder` object,
the Elasticsearch built-in helper for generating JSON content.
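
For orientation, the JSON content that such a builder describes is simply the
document's fields rendered as a JSON object. The following is a minimal
plain-Java sketch of that shape. It does not use `XContentBuilder`; the class
`ArtificialDocSketch`, its methods, and the field name `user` are hypothetical
names chosen for illustration, and the escaping covers only quotes and
backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: hand-renders a flat document as JSON (not the client API). */
final class ArtificialDocSketch {

    /** Wraps a string in quotes, escaping backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders a flat map of string fields as a JSON object, in insertion order. */
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("user", "guest-user"); // hypothetical field and value
        System.out.println(toJson(doc)); // prints {"user":"guest-user"}
    }
}
```

In client code the same fields would be written through the builder rather than
concatenated by hand; the sketch only shows what the resulting content looks like.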

===== Optional arguments

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-optional-arguments]
--------------------------------------------------
<1> Set `fieldStatistics` to `false` (default is `true`) to omit the document count,
the sum of document frequencies, and the sum of total term frequencies.
<2> Set `termStatistics` to `true` (default is `false`) to include the
total term frequency and the document frequency.
<3> Set `positions` to `false` (default is `true`) to omit
positions from the output.
<4> Set `offsets` to `false` (default is `true`) to omit
offsets from the output.
<5> Set `payloads` to `false` (default is `true`) to omit
payloads from the output.
<6> Set `filterSettings` to filter the terms that are returned based
on their tf-idf scores.
<7> Set `perFieldAnalyzer` to use an analyzer other than
the one configured for the field.
<8> Set `realtime` to `false` (default is `true`) to retrieve term vectors
in near real time rather than in real time.
<9> Set a routing parameter.
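
The statistics named in the callouts above can be easier to follow when
recomputed by hand. The sketch below does that over a tiny in-memory corpus,
assuming whitespace tokenization and lowercasing as a stand-in for a real
analyzer. `TermStatsSketch` and its methods are hypothetical names, and the
`tfIdf` method uses the textbook formula `tf * ln(N / df)`, which is an
assumption for illustration and not necessarily the scoring that
`filterSettings` applies on the server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;

/** Illustrative only: recomputes term-vector style statistics for one field. */
final class TermStatsSketch {

    /** Lowercases and splits on whitespace (a stand-in for a real analyzer). */
    static List<String> analyze(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    /** Term frequency: occurrences of the term in a single document. */
    static int termFreq(String doc, String term) {
        int count = 0;
        for (String t : analyze(doc)) {
            if (t.equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** Document frequency: number of documents that contain the term. */
    static int docFreq(List<String> docs, String term) {
        int count = 0;
        for (String doc : docs) {
            if (termFreq(doc, term) > 0) {
                count++;
            }
        }
        return count;
    }

    /** Total term frequency: occurrences of the term across all documents. */
    static long totalTermFreq(List<String> docs, String term) {
        long sum = 0;
        for (String doc : docs) {
            sum += termFreq(doc, term);
        }
        return sum;
    }

    /** Field statistic: document frequencies summed over all distinct terms. */
    static long sumDocFreq(List<String> docs) {
        long sum = 0;
        for (String doc : docs) {
            sum += new HashSet<>(analyze(doc)).size();
        }
        return sum;
    }

    /** Field statistic: total term frequencies summed over all terms (token count). */
    static long sumTotalTermFreq(List<String> docs) {
        long sum = 0;
        for (String doc : docs) {
            sum += analyze(doc).size();
        }
        return sum;
    }

    /** Textbook tf-idf (assumption): tf * ln(N / df); 0 if the term is absent. */
    static double tfIdf(List<String> docs, String doc, String term) {
        int df = docFreq(docs, term);
        if (df == 0) {
            return 0.0;
        }
        return termFreq(doc, term) * Math.log((double) docs.size() / df);
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "the quick brown fox", "the lazy dog", "the fox jumps over the dog");
        System.out.println("docFreq(the) = " + docFreq(docs, "the"));
        System.out.println("totalTermFreq(the) = " + totalTermFreq(docs, "the"));
        System.out.println("sumDocFreq = " + sumDocFreq(docs));
        System.out.println("sumTotalTermFreq = " + sumTotalTermFreq(docs));
    }
}
```

Under this formula, a term that occurs in every document scores 0 regardless
of how often it appears, which is the intuition behind filtering on tf-idf:
very common terms are the first to be dropped.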

include::../execution.asciidoc[]

[id="{upid}-{api}-response"]
==== Term Vectors Response

+{response}+ contains the following information:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> The index name of the document.
<2> The id of the document.
<3> Indicates whether the document was found.

===== Inspecting Term Vectors

If +{response}+ contains a non-null list of term vectors,
more information about each term vector can be obtained as follows:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-term-vectors]
--------------------------------------------------
<1> The name of the current field.
<2> Field statistics for the current field: document count.
<3> Field statistics for the current field: sum of total term frequencies.
<4> Field statistics for the current field: sum of document frequencies.
<5> Terms for the current field.
<6> The name of the term.
<7> Term frequency of the term.
<8> Document frequency of the term.
<9> Total term frequency of the term.
<10> Score of the term.
<11> Tokens of the term.
<12> Position of the token.
<13> Start offset of the token.
<14> End offset of the token.
<15> Payload of the token.
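
To make the token-level callouts concrete, the sketch below derives a
position, a start offset, and an end offset for each token of a string,
assuming a simple whitespace tokenizer. This is not how the response is
produced (the field's analyzer decides the actual tokens); `TokenSketch` and
`Token` are hypothetical names, and the zero-based positions and exclusive end
offsets follow the usual Lucene convention.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: assigns positions and character offsets to whitespace tokens. */
final class TokenSketch {

    /** One token occurrence: its text, ordinal position, and character span. */
    static final class Token {
        final String term;
        final int position;    // zero-based ordinal of the token in the field
        final int startOffset; // index of the first character (inclusive)
        final int endOffset;   // index just past the last character (exclusive)

        Token(String term, int position, int startOffset, int endOffset) {
            this.term = term;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Scans the text once, emitting a token for each run of non-whitespace. */
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= text.length()) {
                break;
            }
            int start = i;
            // Consume the token itself.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            tokens.add(new Token(text.substring(start, i), position++, start, i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("guest user")) {
            System.out.println(t.term + " position=" + t.position
                + " start=" + t.startOffset + " end=" + t.endOffset);
        }
    }
}
```

For the input `guest user`, the second token gets position 1 and spans
characters 6 to 10, so `text.substring(startOffset, endOffset)` recovers the
original token text.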