Reformats term vectors APIs (#47484)

* Reformat termvectors APIs

* Reformats mtermvectors

* Apply suggestions from code review

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Incorporated review feedback.
debadair 6 years ago
parent
commit
07f4ca799b

+ 69 - 16
docs/reference/docs/multi-termvectors.asciidoc

@@ -1,14 +1,10 @@
 [[docs-multi-termvectors]]
-=== Multi termvectors API
+=== Multi term vectors API
+++++
+<titleabbrev>Multi term vectors</titleabbrev>
+++++
 
-Multi termvectors API allows to get multiple termvectors at once. The
-documents from which to retrieve the term vectors are specified by an index and id.
-But the documents could also be artificially provided in the request itself.
-
-The response includes a `docs`
-array with all the fetched termvectors, each element having the structure
-provided by the <<docs-termvectors,termvectors>>
-API. Here is an example:
+Retrieves multiple term vectors with a single request. 
 
 [source,console]
 --------------------------------------------------
@@ -32,10 +28,64 @@ POST /_mtermvectors
 --------------------------------------------------
 // TEST[setup:twitter]
 
-See the <<docs-termvectors,termvectors>> API for a description of possible parameters.
+[[docs-multi-termvectors-api-request]]
+==== {api-request-title}
+
+`POST /_mtermvectors`
+
+`POST /<index>/_mtermvectors`
+
+[[docs-multi-termvectors-api-desc]]
+==== {api-description-title}
+
+You can specify existing documents by index and ID or 
+provide artificial documents in the body of the request.  
+The index can be specified in the body of the request or in the request URI.
+
+The response contains a `docs` array with all the fetched termvectors. 
+Each element has the structure provided by the <<docs-termvectors,termvectors>>
+API. 
+
+See the <<docs-termvectors,termvectors>> API for details about the information
+that can be included in the response.
+
+[[docs-multi-termvectors-api-path-params]]
+==== {api-path-parms-title}
+
+`<index>`::
+(Optional, string) Name of the index that contains the documents.
+
+[[docs-multi-termvectors-api-query-params]]
+==== {api-query-parms-title}
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=fields]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=field_statistics]
 
-The `_mtermvectors` endpoint can also be used against an index (in which case it
-is not required in the body):
+include::{docdir}/rest-api/common-parms.asciidoc[tag=offsets]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=payloads]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=positions]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=preference]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=routing]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=realtime]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=term_statistics]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=version]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=version_type]
+
+[float]
+[[docs-multi-termvectors-api-example]]
+==== {api-examples-title}
+
+If you specify an index in the request URI, the index does not need to be specified for each document
+in the request body:
 
 [source,console]
 --------------------------------------------------
@@ -57,7 +107,8 @@ POST /twitter/_mtermvectors
 --------------------------------------------------
 // TEST[setup:twitter]
 
-If all requested documents are on same index and also the parameters are the same, the request can be simplified:
+If all requested documents are in the same index and the parameters are the same, you can use the
+following simplified syntax:
 
 [source,console]
 --------------------------------------------------
@@ -74,9 +125,11 @@ POST /twitter/_mtermvectors
 --------------------------------------------------
 // TEST[setup:twitter]
 
-Additionally, just like for the <<docs-termvectors,termvectors>>
-API, term vectors could be generated for user provided documents.
-The mapping used is determined by `_index`.
+[[docs-multi-termvectors-artificial-doc]]
+===== Artificial documents
+
+You can also use `mtermvectors` to generate term vectors for _artificial_ documents provided
+in the body of the request. The mapping used is determined by the specified `_index`.
 
 [source,console]
 --------------------------------------------------

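Note on the request bodies described in this file: the hunks above elide the JSON bodies of the console examples, so the following is only a minimal Python sketch of the two shapes the text describes (the simplified `ids`/`parameters` syntax and a `docs` array carrying an artificial document). The helper name `mtermvectors_body` is hypothetical, the key names (`ids`, `parameters`, `docs`, `doc`) are assumed from the published reference examples, and rejecting a mix of `ids` and `docs` is a simplification of this sketch rather than documented API behavior.

```python
import json


def mtermvectors_body(ids=None, parameters=None, docs=None):
    """Build a JSON-serializable body for POST /<index>/_mtermvectors.

    Hypothetical helper, not part of any Elasticsearch client.
    - ids + parameters: simplified syntax (same index, same parameters)
    - docs: one entry per document (existing or artificial)
    """
    if ids is not None and docs is not None:
        # Simplification of this sketch: keep the two shapes separate.
        raise ValueError("pass either ids or docs, not both")
    if docs is not None:
        return {"docs": list(docs)}
    body = {"ids": [str(i) for i in (ids or [])]}
    if parameters:
        body["parameters"] = dict(parameters)
    return body


# Simplified syntax: the index comes from the request URI.
simplified = mtermvectors_body(
    ids=[1, 2],
    parameters={"fields": ["message"], "term_statistics": True},
)

# Artificial document: the mapping is taken from the given `_index`.
artificial = mtermvectors_body(
    docs=[{"_index": "twitter", "doc": {"message": "twitter test test test"}}]
)

print(json.dumps(simplified, indent=2))
print(json.dumps(artificial, indent=2))
```

Either dictionary would be sent as the JSON body of the `POST` request; the sketch only builds the payload and does not contact a cluster.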
+ 72 - 29
docs/reference/docs/termvectors.asciidoc

@@ -1,10 +1,10 @@
 [[docs-termvectors]]
-=== Term Vectors
+=== Term vectors API
+++++
+<titleabbrev>Term vectors</titleabbrev>
+++++
 
-Returns information and statistics on terms in the fields of a particular
-document. The document could be stored in the index or artificially provided
-by the user. Term vectors are <<realtime,realtime>> by default, not near
-realtime. This can be changed by setting `realtime` parameter to `false`.
+Retrieves information and statistics for terms in the fields of a particular document. 
 
 [source,console]
 --------------------------------------------------
@@ -12,8 +12,19 @@ GET /twitter/_termvectors/1
 --------------------------------------------------
 // TEST[setup:twitter]
 
-Optionally, you can specify the fields for which the information is
-retrieved either with a parameter in the url
+[[docs-termvectors-api-request]]
+==== {api-request-title}
+
+`GET /<index>/_termvectors/<_id>`
+
+[[docs-termvectors-api-desc]]
+==== {api-description-title}
+
+You can retrieve term vectors for documents stored in the index or 
+for _artificial_ documents passed in the body of the request. 
+
+You can specify the fields you are interested in through the `fields` parameter,
+or by adding the fields to the request body. 
 
 [source,console]
 --------------------------------------------------
@@ -21,18 +32,16 @@ GET /twitter/_termvectors/1?fields=message
 --------------------------------------------------
 // TEST[setup:twitter]
 
-or by adding the requested fields in the request body (see
-example below). Fields can also be specified with wildcards
-in similar way to the <<query-dsl-multi-match-query,multi match query>>
+Fields can be specified using wildcards, similar to the <<query-dsl-multi-match-query,multi match query>>.
 
-[float]
-==== Return values
+Term vectors are <<realtime,real-time>> by default, not near real-time. 
+This can be changed by setting the `realtime` parameter to `false`.
 
-Three types of values can be requested: _term information_, _term statistics_
+You can request three types of values: _term information_, _term statistics_
 and _field statistics_. By default, all term information and field
-statistics are returned for all fields but no term statistics.
+statistics are returned for all fields but term statistics are excluded.
 
-[float]
+[[docs-termvectors-api-term-info]]
 ===== Term information
 
  * term frequency in the field (always returned)
@@ -52,7 +61,7 @@ should make sure that the string you are taking a sub-string of is also encoded
 using UTF-16.
 ======
 
-[float]
+[[docs-termvectors-api-term-stats]]
 ===== Term statistics
 
 Setting `term_statistics` to `true` (default is `false`) will
@@ -65,7 +74,7 @@ return
 By default these values are not returned since term statistics can
 have a serious performance impact.
 
-[float]
+[[docs-termvectors-api-field-stats]]
 ===== Field statistics
 
 Setting `field_statistics` to `false` (default is `true`) will
@@ -77,8 +86,8 @@ omit :
  * sum of total term frequencies (the sum of total term frequencies of
    each term in this field)
 
-[float]
-===== Terms Filtering
+[[docs-termvectors-api-terms-filtering]]
+===== Terms filtering
 
 With the parameter `filter`, the terms returned could also be filtered based
 on their tf-idf scores. This could be useful in order find out a good
@@ -105,7 +114,7 @@ The following sub-parameters are supported:
 `max_word_length`::
   The maximum word length above which words will be ignored. Defaults to unbounded (`0`).
 
-[float]
+[[docs-termvectors-api-behavior]]
 ==== Behaviour
 
 The term and field statistics are not accurate. Deleted documents
@@ -116,8 +125,45 @@ whereas the absolute numbers have no meaning in this context. By default,
 when requesting term vectors of artificial documents, a shard to get the statistics
 from is randomly selected. Use `routing` only to hit a particular shard.
 
-[float]
-===== Example: Returning stored term vectors
+[[docs-termvectors-api-path-params]]
+==== {api-path-parms-title}
+
+`<index>`::
+(Required, string) Name of the index that contains the document.
+
+`<_id>`::
+(Optional, string) Unique identifier of the document.
+
+[[docs-termvectors-api-query-params]]
+==== {api-query-parms-title}
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=fields]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=field_statistics]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=offsets]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=payloads]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=positions]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=preference]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=routing]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=realtime]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=term_statistics]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=version]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=version_type]
+
+[[docs-termvectors-api-example]]
+==== {api-examples-title}
+
+[[docs-termvectors-api-stored-termvectors]]
+===== Returning stored term vectors
 
 First, we create an index that stores term vectors, payloads etc. :
 
@@ -259,8 +305,8 @@ Response:
 // TEST[continued]
 // TESTRESPONSE[s/"took": 6/"took": "$body.took"/]
 
-[float]
-===== Example: Generating term vectors on the fly
+[[docs-termvectors-api-generate-termvectors]]
+===== Generating term vectors on the fly
 
 Term vectors which are not explicitly stored in the index are automatically
 computed on the fly. The following request returns all information and statistics for the
@@ -281,8 +327,7 @@ GET /twitter/_termvectors/1
 // TEST[continued]
 
 [[docs-termvectors-artificial-doc]]
-[float]
-===== Example: Artificial documents
+===== Artificial documents
 
 Term vectors can also be generated for artificial documents,
 that is for documents not present in the index.  For example, the following request would
@@ -304,7 +349,6 @@ GET /twitter/_termvectors
 // TEST[continued]
 
 [[docs-termvectors-per-field-analyzer]]
-[float]
 ====== Per-field analyzer
 
 Additionally, a different analyzer than the one at the field may be provided
@@ -369,8 +413,7 @@ Response:
 
 
 [[docs-termvectors-terms-filtering]]
-[float]
-===== Example: Terms filtering
+===== Terms filtering
 
 Finally, the terms returned could be filtered based on their tf-idf scores. In
 the example below we obtain the three most "interesting" keywords from the

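Note on the request line and query parameters documented in this file: a minimal Python sketch of how `GET /<index>/_termvectors/<_id>` and its query parameters could be assembled into a request URI. The helper name `termvectors_url` is hypothetical; the lowercase `true`/`false` rendering of booleans and the comma-separated form for multiple fields are assumptions of this sketch, chosen to match the `?fields=message` example in the diff.

```python
from urllib.parse import quote, urlencode


def termvectors_url(index, doc_id=None, **params):
    """Return the path and query string for a term vectors request.

    Hypothetical helper: `doc_id` is optional because artificial
    documents are passed in the request body instead of by ID.
    """
    path = f"/{quote(index, safe='')}/_termvectors"
    if doc_id is not None:
        path += f"/{quote(str(doc_id), safe='')}"

    query = {}
    for name, value in params.items():
        if isinstance(value, bool):
            # Render booleans in lowercase, as JSON does.
            query[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # Assumed convention: multiple values as a comma-separated list.
            query[name] = ",".join(value)
        else:
            query[name] = str(value)

    if not query:
        return path
    # Keep commas readable instead of percent-encoding them.
    return path + "?" + urlencode(query, safe=",")


print(termvectors_url("twitter", 1, fields=["message"]))
# /twitter/_termvectors/1?fields=message
```

Passing `term_statistics=True` or `field_statistics=False` as keyword arguments would append the corresponding flags described in the query parameters section.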
+ 38 - 7
docs/reference/rest-api/common-parms.asciidoc

@@ -135,6 +135,13 @@ Wildcard expressions are not accepted.
 --
 end::expand-wildcards[]
 
+tag::field_statistics[]
+`field_statistics`::
+(Optional, boolean) If `true`, the response includes the document count, sum of document frequencies, 
+and sum of total term frequencies.
+Defaults to `true`.
+end::field_statistics[]
+
 tag::fielddata-fields[]
 `fielddata_fields`::
 (Optional, string)
@@ -222,7 +229,7 @@ end::cat-h[]
 
 tag::help[]
 `help`::
-(Optional, boolean) If `true`, the response returns help information. Defaults
+(Optional, boolean) If `true`, the response includes help information. Defaults
 to `false`.
 end::help[]
 
@@ -444,6 +451,12 @@ Comma-separated list of node IDs or names
 used to limit returned information.
 end::node-id-query-parm[]
 
+tag::offsets[]
+`offsets`::
+(Optional, boolean) If `true`, the response includes term offsets.
+Defaults to `true`.
+end::offsets[]
+
 tag::parent-task-id[]
 `parent_task_id`::
 +
@@ -469,6 +482,18 @@ tag::path-pipeline[]
 used to limit the request.
 end::path-pipeline[]
 
+tag::payloads[]
+`payloads`::
+(Optional, boolean) If `true`, the response includes term payloads.
+Defaults to `true`.
+end::payloads[]
+
+tag::positions[]
+`positions`::
+(Optional, boolean) If `true`, the response includes term positions.
+Defaults to `true`.
+end::positions[]
+
 tag::preference[]
 `preference`::
 (Optional, string) Specifies the node or shard the operation should be 
@@ -488,8 +513,8 @@ end::query[]
 
 tag::realtime[]
 `realtime`::
-(Optional, boolean) Set to `false` to disable real time GET
-(default: `true`). See <<realtime>>.
+(Optional, boolean) If `true`, the request is real-time as opposed to near-real-time. 
+Defaults to `true`. See <<realtime>>.
 end::realtime[]
 
 tag::refresh[]
@@ -502,8 +527,8 @@ end::refresh[]
 
 tag::request_cache[]
 `request_cache`::
-(Optional, boolean) Specifies if the request cache should be used for this
-request. Defaults to the index-level setting.
+(Optional, boolean) If `true`, the request cache is used for this request. 
+Defaults to the index-level setting.
 end::request_cache[]
 
 tag::requests_per_second[]
@@ -645,6 +670,12 @@ tag::task-id[]
 (`node_id:task_number`).
 end::task-id[]
 
+tag::term_statistics[]
+`term_statistics`::
+(Optional, boolean) If `true`, the response includes total term frequency and document frequency.
+Defaults to `false`.
+end::term_statistics[]
+
 tag::terminate_after[]
 `terminate_after`::
 (Optional, integer) The maximum number of documents to collect for each shard, 
@@ -671,8 +702,8 @@ end::timeoutparms[]
 
 tag::cat-v[]
 `v`::
-(Optional, boolean) If `true`, the response includes column headings. Defaults
-to `false`.
+(Optional, boolean) If `true`, the response includes column headings. 
+Defaults to `false`.
 end::cat-v[]
 
 tag::version[]