[[docs-index_]]
=== Index API
++++
<titleabbrev>Index</titleabbrev>
++++

IMPORTANT: See <<removal-of-types>>.

Adds a JSON document to the specified data stream or index and makes
it searchable. If the target is an index and the document already exists,
the request updates the document and increments its version.

NOTE: You cannot use the index API to send update requests for existing
documents to a data stream. See <<update-docs-in-a-data-stream-by-query>>
and <<update-delete-docs-in-a-backing-index>>.

[[docs-index-api-request]]
==== {api-request-title}

`PUT /<target>/_doc/<_id>`

`POST /<target>/_doc/`

`PUT /<target>/_create/<_id>`

`POST /<target>/_create/<_id>`

IMPORTANT: You cannot add new documents to a data stream using the
`PUT /<target>/_doc/<_id>` request format. To specify a document ID, use the
`PUT /<target>/_create/<_id>` format instead. See
<<add-documents-to-a-data-stream>>.

[[docs-index-api-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have the following
<<privileges-list-indices,index privileges>> for the target data stream, index,
or index alias:

** To add or overwrite a document using the `PUT /<target>/_doc/<_id>` request
format, you must have the `create`, `index`, or `write` index privilege.

** To add a document using the `POST /<target>/_doc/`,
`PUT /<target>/_create/<_id>`, or `POST /<target>/_create/<_id>` request
formats, you must have the `create_doc`, `create`, `index`, or `write` index
privilege.

** To automatically create a data stream or index with an index API request, you
must have the `auto_configure`, `create_index`, or `manage` index privilege.

* Automatic data stream creation requires a matching index template with data
stream enabled. See <<set-up-a-data-stream>>.

[[docs-index-api-path-params]]
==== {api-path-parms-title}

`<target>`::
(Required, string) Name of the data stream or index to target.
+
If the target doesn't exist and matches the name or wildcard (`*`) pattern of an
<<create-a-data-stream-template,index template with a `data_stream`
definition>>, this request creates the data stream. See
<<set-up-a-data-stream>>.
+
If the target doesn't exist and doesn't match a data stream template,
this request creates the index.
+
You can check for existing targets using the resolve index API.

`<_id>`::
(Optional, string) Unique identifier for the document.
+
--
This parameter is required for the following request formats:

* `PUT /<target>/_doc/<_id>`
* `PUT /<target>/_create/<_id>`
* `POST /<target>/_create/<_id>`

To automatically generate a document ID, use the `POST /<target>/_doc/` request
format and omit this parameter.
--

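The rules above for choosing a request format can be sketched as a small helper. This is only an illustration: `build_index_request` and its `create_only` flag are hypothetical names, not part of any {es} client, and the sketch does not URL-encode the target or the ID.

```python
from typing import Optional, Tuple


def build_index_request(target: str, doc_id: Optional[str] = None,
                        create_only: bool = False) -> Tuple[str, str]:
    """Return (HTTP method, path) for one of the documented request formats.

    Hypothetical helper for illustration; it does not URL-encode its inputs.
    """
    if create_only:
        # The _create formats always need an explicit document ID.
        if doc_id is None:
            raise ValueError("the _create resource requires a document ID")
        return "PUT", f"/{target}/_create/{doc_id}"
    if doc_id is None:
        # Omitting the ID asks Elasticsearch to generate one.
        return "POST", f"/{target}/_doc/"
    # With an ID, the _doc resource adds or overwrites the document.
    return "PUT", f"/{target}/_doc/{doc_id}"
```

For a data stream, only the `create_only=True` and ID-less forms apply, because the `PUT /<target>/_doc/<_id>` format cannot add documents to a data stream.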
[[docs-index-api-query-params]]
==== {api-query-parms-title}

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=if_seq_no]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=if_primary_term]

[[docs-index-api-op_type]]
`op_type`::
(Optional, enum) Set to `create` to only index the document
if it does not already exist (_put if absent_). If a document with the specified
`_id` already exists, the indexing operation will fail. This is the same as
using the `<index>/_create` endpoint. Valid values: `index`, `create`.
If a document ID is specified, `op_type` defaults to `index`. Otherwise, it
defaults to `create`.
+
NOTE: If the request targets a data stream, an `op_type` of `create` is
required. See <<add-documents-to-a-data-stream>>.

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=pipeline]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=refresh]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=routing]

`timeout`::
+
--
(Optional, <<time-units, time units>>)
Period the request waits for the following operations:

* <<index-creation,Automatic index creation>>
* <<dynamic-mapping,Dynamic mapping>> updates
* <<index-wait-for-active-shards,Waiting for active shards>>

Defaults to `1m` (one minute). This guarantees {es} waits for at least the
timeout before failing. The actual wait time could be longer, particularly when
multiple waits occur.
--

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=doc-version]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=version_type]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=wait_for_active_shards]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=require-alias]

[[docs-index-api-request-body]]
==== {api-request-body-title}

`<field>`::
(Required, string) Request body contains the JSON source for the document
data.

[[docs-index-api-response-body]]
==== {api-response-body-title}

`_shards`::
Provides information about the replication process of the index operation.

`_shards.total`::
Indicates how many shard copies (primary and replica shards) the index operation
should be executed on.

`_shards.successful`::
Indicates the number of shard copies the index operation succeeded on.
When the index operation is successful, `successful` is at least 1.
+
NOTE: Replica shards might not all be started when an indexing operation
returns successfully. By default, only the primary is required. Set
`wait_for_active_shards` to change this default behavior. See
<<index-wait-for-active-shards>>.

`_shards.failed`::
An array that contains replication-related errors if the index operation
failed on a replica shard. `0` indicates there were no failures.

`_index`::
The name of the index the document was added to.

`_type`::
The document type. {es} indices now support a single document type, `_doc`.

`_id`::
The unique identifier for the added document.

`_version`::
The document version. Incremented each time the document is updated.

`_seq_no`::
The sequence number assigned to the document for the indexing operation.
Sequence numbers are used to ensure an older version of a document
doesn't overwrite a newer version. See <<optimistic-concurrency-control-index>>.

`_primary_term`::
The primary term assigned to the document for the indexing operation.
See <<optimistic-concurrency-control-index>>.

`result`::
The result of the indexing operation, `created` or `updated`.

[[docs-index-api-desc]]
==== {api-description-title}

You can index a new JSON document with the `_doc` or `_create` resource. Using
`_create` guarantees that the document is only indexed if it does not already
exist. To update an existing document, you must use the `_doc` resource.

[[index-creation]]
===== Automatically create data streams and indices

If the request's target doesn't exist and matches an
<<create-a-data-stream-template,index template with a `data_stream`
definition>>, the index operation automatically creates the data stream. See
<<set-up-a-data-stream>>.

If the target doesn't exist and doesn't match a data stream template,
the operation automatically creates the index and applies any matching
<<index-templates,index templates>>.

[IMPORTANT]
====
{es} has built-in index templates for the `metrics-*-*`, `logs-*-*`, and `synthetics-*-*` index
patterns, each with a priority of `100`.
{fleet-guide}/fleet-overview.html[{agent}] uses these templates to
create data streams. If you use {agent}, assign your index templates a priority
lower than `100` to avoid overriding the built-in templates.

Otherwise, to avoid accidentally applying the built-in templates, use a
non-overlapping index pattern or assign templates with an overlapping pattern a
`priority` higher than `100`.

For example, if you don't use {agent} and want to create a template for the
`logs-*` index pattern, assign your template a priority of `200`. This ensures
your template is applied instead of the built-in template for `logs-*-*`.
====

If no mapping exists, the index operation
creates a dynamic mapping. By default, new fields and objects are
automatically added to the mapping if needed. For more information about field
mapping, see <<mapping,mapping>> and the <<indices-put-mapping,put mapping>> API.

Automatic index creation is controlled by the `action.auto_create_index`
setting. This setting defaults to `true`, which allows any index to be created
automatically. You can modify this setting to explicitly allow or block
automatic creation of indices that match specified patterns, or set it to
`false` to disable automatic index creation entirely. Specify a
comma-separated list of patterns you want to allow, or prefix each pattern with
`+` or `-` to indicate whether it should be allowed or blocked. When a list is
specified, the default behavior is to disallow any index that matches none of
the patterns.

IMPORTANT: The `action.auto_create_index` setting only affects the automatic
creation of indices. It does not affect the creation of data streams.

[source,console]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "my-index-000001,index10,-index1*,+ind*" <1>
  }
}

PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "false" <2>
  }
}

PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "true" <3>
  }
}
--------------------------------------------------

<1> Allow auto-creation of indices called `my-index-000001` or `index10`, block the
creation of indices that match the pattern `index1*`, and allow creation of
any other indices that match the `ind*` pattern. Patterns are matched in
the order specified.
<2> Disable automatic index creation entirely.
<3> Allow automatic creation of any index. This is the default.

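To make the ordering rule concrete, the following sketch models how a pattern list could be evaluated: first match wins, an unprefixed pattern counts as allowed, and a name that matches nothing is blocked. It is a simplified model of the behavior described here, not the actual {es} implementation. The function name `auto_create_allowed` is hypothetical, and Python's `fnmatchcase` also treats `?` and `[` as wildcards, which is an assumption that goes beyond the `*` wildcard documented above.

```python
from fnmatch import fnmatchcase


def auto_create_allowed(index_name: str, setting: str) -> bool:
    """Model of action.auto_create_index evaluation (illustrative only)."""
    value = setting.strip()
    if value == "true":
        return True   # any index may be created automatically
    if value == "false":
        return False  # automatic index creation is disabled
    for raw in value.split(","):
        pattern = raw.strip()
        if not pattern:
            continue
        allow = True  # unprefixed patterns are treated as allowed
        if pattern[0] in "+-":
            allow = pattern[0] == "+"
            pattern = pattern[1:]
        # Patterns are checked in the order specified; the first match decides.
        if fnmatchcase(index_name, pattern):
            return allow
    # With a pattern list, anything that matches no pattern is disallowed.
    return False
```

With the list from the first request above, `index10` is allowed by its exact entry, `index11` is blocked by `-index1*`, and `index2` is allowed by `+ind*`.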
[discrete]
[[operation-type]]
===== Put if absent

You can force a create operation by using the `_create` resource or
setting the `op_type` parameter to _create_. In this case,
the index operation fails if a document with the specified ID
already exists in the index.

[discrete]
[[create-document-ids-automatically]]
===== Create document IDs automatically

When using the `POST /<target>/_doc/` request format, the `op_type` is
automatically set to `create` and the index operation generates a unique ID for
the document.

[source,console]
--------------------------------------------------
POST my-index-000001/_doc/
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
--------------------------------------------------

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "W0tpsmIBdwcYyG50zbta",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}
--------------------------------------------------
// TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/ s/"successful": 2/"successful": 1/]

[discrete]
[[optimistic-concurrency-control-index]]
===== Optimistic concurrency control

Index operations can be made conditional and only be performed if the last
modification to the document was assigned the sequence number and primary
term specified by the `if_seq_no` and `if_primary_term` parameters. If a
mismatch is detected, the operation will result in a `VersionConflictException`
and a status code of 409. See <<optimistic-concurrency-control>> for more details.

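The condition can be pictured as a compare-and-set on the pair (`_seq_no`, `_primary_term`). The sketch below is a minimal in-memory model, not {es} code. The function name `conditional_write_allowed` is hypothetical, and requiring both parameters to be supplied together is an assumption of this sketch.

```python
from typing import Optional


def conditional_write_allowed(stored_seq_no: int, stored_primary_term: int,
                              if_seq_no: Optional[int] = None,
                              if_primary_term: Optional[int] = None) -> bool:
    """Return True if the write may proceed, False if it would conflict (409)."""
    if if_seq_no is None and if_primary_term is None:
        # No condition given: the write is unconditional.
        return True
    if if_seq_no is None or if_primary_term is None:
        # Assumption of this sketch: the two parameters travel together.
        raise ValueError("if_seq_no and if_primary_term must be set together")
    # The write proceeds only if the last modification carries exactly
    # the sequence number and primary term the caller expects.
    return (stored_seq_no, stored_primary_term) == (if_seq_no, if_primary_term)
```

A typical caller reads the document, remembers its `_seq_no` and `_primary_term`, and passes them back on the next write. A `False` result here corresponds to the conflict described above, after which the caller re-reads and retries.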
[discrete]
[[index-routing]]
===== Routing

By default, shard placement (or `routing`) is controlled by using a
hash of the document's ID value. For more explicit control, the value
fed into the hash function used by the router can be directly specified
on a per-operation basis using the `routing` parameter. For example:

[source,console]
--------------------------------------------------
POST my-index-000001/_doc?routing=kimchy
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
--------------------------------------------------

In this example, the document is routed to a shard based on
the `routing` parameter provided: "kimchy".

When setting up explicit mapping, you can also use the `_routing` field
to direct the index operation to extract the routing value from the
document itself. This does come at the (very minimal) cost of an
additional document parsing pass. If the `_routing` mapping is defined
and set to be `required`, the index operation will fail if no routing
value is provided or extracted.

NOTE: Data streams do not support custom routing. Instead, target the
appropriate backing index for the stream.

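As a rough picture of hash-based placement, the sketch below maps a routing value to a shard number by hashing it and taking the remainder modulo the number of primary shards. This is a deliberately simplified model: `zlib.crc32` is a stand-in for the hash {es} actually uses, the modulo step is an assumption of the sketch, and `effective_routing` and `pick_shard` are hypothetical names.

```python
import zlib
from typing import Optional


def effective_routing(doc_id: str, routing: Optional[str] = None) -> str:
    """Use the explicit routing value if given, otherwise the document ID."""
    return routing if routing is not None else doc_id


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value to a shard number in [0, number_of_primary_shards)."""
    # crc32 is only a stand-in hash for illustration.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards
```

The point of the model is that documents indexed with the same `routing` value land on the same shard regardless of their IDs, while documents without an explicit value are spread according to their IDs.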
[discrete]
[[index-distributed]]
===== Distributed

The index operation is directed to the primary shard based on its route
(see the Routing section above) and performed on the actual node
containing this shard. After the primary shard completes the operation,
if needed, the update is distributed to applicable replicas.

[discrete]
[[index-wait-for-active-shards]]
===== Active shards

To improve the resiliency of writes to the system, indexing operations
can be configured to wait for a certain number of active shard copies
before proceeding with the operation. If the requisite number of active
shard copies are not available, then the write operation must wait and
retry, until either the requisite shard copies have started or a timeout
occurs. By default, write operations only wait for the primary shards
to be active before proceeding (i.e. `wait_for_active_shards=1`).
This default can be overridden in the index settings dynamically
by setting `index.write.wait_for_active_shards`. To alter this behavior
per operation, the `wait_for_active_shards` request parameter can be used.

Valid values are `all` or any positive integer up to the total number
of configured copies per shard in the index (which is `number_of_replicas+1`).
Specifying a negative value or a number greater than the number of
shard copies results in an error.

For example, suppose we have a cluster of three nodes, `A`, `B`, and `C` and
we create an index `index` with the number of replicas set to 3 (resulting in
4 shard copies, one more copy than there are nodes). If we
attempt an indexing operation, by default the operation will only ensure
the primary copy of each shard is available before proceeding. This means
that even if `B` and `C` went down, and `A` hosted the primary shard copies,
the indexing operation would still proceed with only one copy of the data.
If `wait_for_active_shards` is set on the request to `3` (and all 3 nodes
are up), then the indexing operation will require 3 active shard copies
before proceeding, a requirement which should be met because there are 3
active nodes in the cluster, each one holding a copy of the shard. However,
if we set `wait_for_active_shards` to `all` (or to `4`, which is the same),
the indexing operation will not proceed as we do not have all 4 copies of
each shard active in the index. The operation will time out
unless a new node is brought up in the cluster to host the fourth copy of
the shard.

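The three-node example can be worked through with a small model. It assumes that each node holds at most one copy of a given shard and that every live node holds an active copy (the best case). The names `required_active_copies` and `can_proceed` are hypothetical.

```python
from typing import Union


def required_active_copies(wait_for_active_shards: Union[int, str],
                           number_of_replicas: int) -> int:
    """Resolve wait_for_active_shards to a number of shard copies."""
    total_copies = number_of_replicas + 1  # one primary plus its replicas
    if wait_for_active_shards == "all":
        return total_copies
    if isinstance(wait_for_active_shards, bool) or not isinstance(wait_for_active_shards, int):
        raise ValueError("expected 'all' or a positive integer")
    if not 1 <= wait_for_active_shards <= total_copies:
        raise ValueError(f"value must be between 1 and {total_copies}")
    return wait_for_active_shards


def can_proceed(wait_for_active_shards: Union[int, str],
                number_of_replicas: int, live_nodes: int) -> bool:
    """Best-case check: one active copy per live node, capped at the total."""
    total_copies = number_of_replicas + 1
    active_copies = min(live_nodes, total_copies)
    return active_copies >= required_active_copies(
        wait_for_active_shards, number_of_replicas)
```

With `number_of_replicas=3`, the default of `1` proceeds even when only node `A` is up, `3` proceeds when all three nodes are up, and `all` (or `4`) waits until a fourth node joins or the timeout expires.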
It is important to note that this setting greatly reduces the chances of
the write operation not writing to the requisite number of shard copies,
but it does not completely eliminate the possibility, because this check
occurs before the write operation commences. Once the write operation
is underway, it is still possible for replication to fail on any number of
shard copies but still succeed on the primary. The `_shards` section of the
write operation's response reveals the number of shard copies on which
replication succeeded/failed.

[source,js]
--------------------------------------------------
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  }
}
--------------------------------------------------
// NOTCONSOLE

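A client that cares about replication can inspect the `_shards` section after each write. The following sketch parses a response body and reports whether every copy acknowledged the operation. The helper name `summarize_shards` and the keys of the returned dictionary are hypothetical, and the sketch assumes `failed` is reported as a count, as in the response above.

```python
import json


def summarize_shards(response_body: str) -> dict:
    """Summarize the _shards section of an index API response."""
    shards = json.loads(response_body)["_shards"]
    total = shards["total"]
    successful = shards["successful"]
    failed = shards["failed"]
    return {
        "successful_copies": successful,
        "failed_copies": failed,
        # Copies that neither succeeded nor failed, for example replicas
        # that were not started when the operation ran.
        "unaccounted_copies": total - successful - failed,
        "fully_replicated": successful == total and failed == 0,
    }
```

For the response shown above, the summary reports two successful copies and `fully_replicated` set to `True`. A response with `"successful": 1` and `"failed": 0` out of a total of 2 would instead show one unaccounted copy.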
[discrete]
[[index-refresh]]
===== Refresh

Control when the changes made by this request are visible to search. See
<<docs-refresh,refresh>>.

[discrete]
[[index-noop]]
===== Noop updates

When updating a document using the index API, a new version of the document is
always created even if the document hasn't changed. If this isn't acceptable,
use the `_update` API with `detect_noop` set to true. This option isn't
available on the index API because the index API doesn't fetch the old source
and isn't able to compare it against the new source.

There isn't a hard and fast rule about when noop updates aren't acceptable.
It's a combination of lots of factors like how frequently your data source
sends updates that are actually noops and how many queries per second
Elasticsearch runs on the shard receiving the updates.

[discrete]
[[timeout]]
===== Timeout

The primary shard assigned to perform the index operation might not be
available when the index operation is executed. Some reasons for this
might be that the primary shard is currently recovering from a gateway
or undergoing relocation. By default, the index operation will wait on
the primary shard to become available for up to 1 minute before failing
and responding with an error. The `timeout` parameter can be used to
explicitly specify how long it waits. Here is an example of setting it
to 5 minutes:

[source,console]
--------------------------------------------------
PUT my-index-000001/_doc/1?timeout=5m
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
--------------------------------------------------

[discrete]
[[index-versioning]]
===== Versioning

Each indexed document is given a version number. By default,
internal versioning is used that starts at 1 and increments
with each update, deletes included. Optionally, the version number can be
set to an external value (for example, if maintained in a
database). To enable this functionality, `version_type` should be set to
`external`. The value provided must be a numeric, long value greater than or equal to 0,
and less than around 9.2e+18.

When using the external version type, the system checks to see if
the version number passed to the index request is greater than the
version of the currently stored document. If true, the document will be
indexed and the new version number used. If the value provided is less
than or equal to the stored document's version number, a version
conflict will occur and the index operation will fail. For example:

[source,console]
--------------------------------------------------
PUT my-index-000001/_doc/1?version=2&version_type=external
{
  "user": {
    "id": "elkbee"
  }
}
--------------------------------------------------
// TEST[continued]

NOTE: Versioning is completely real time, and is not affected by the
near real time aspects of search operations. If no version is provided,
then the operation is executed without any version checks.

In the previous example, the operation will succeed since the supplied
version of 2 is higher than
the current document version of 1. If the document was already updated
and its version was set to 2 or higher, the indexing command will fail
and result in a conflict (HTTP status code 409).

A nice side effect is that there is no need to maintain strict ordering
of async indexing operations executed as a result of changes to a source
database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from
a database is simplified if external versioning is used, as only the
latest version will be used if the index operations arrive out of order for
whatever reason.

[discrete]
[[index-version-types]]
===== Version types

In addition to the `external` version type, Elasticsearch
also supports other types for specific use cases:

[[_version_types]]
`internal`:: Only index the document if the given version is identical to the version
of the stored document.

`external` or `external_gt`:: Only index the document if the given version is strictly higher
than the version of the stored document *or* if there is no existing document. The given
version will be used as the new version and will be stored with the new document. The supplied
version must be a non-negative long number.

`external_gte`:: Only index the document if the given version is *equal to* or higher
than the version of the stored document. If there is no existing document
the operation will succeed as well. The given version will be used as the new version
and will be stored with the new document. The supplied version must be a non-negative long number.

NOTE: The `external_gte` version type is meant for special use cases and
should be used with care. If used incorrectly, it can result in loss of data.
There is another option, `force`, which is deprecated because it can cause
primary and replica shards to diverge.

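The comparison rules for the version types above can be summarized in a few lines. This is a sketch of the rules as stated in this section rather than the {es} implementation. The function name `version_check_passes` is hypothetical, and treating `internal` with no stored document as a failed check is an assumption.

```python
from typing import Optional


def version_check_passes(version_type: str, given: int,
                         stored: Optional[int]) -> bool:
    """Return True if the document would be indexed, False on a conflict.

    `stored` is the version of the existing document, or None if there is none.
    """
    if version_type in ("external", "external_gt"):
        # Strictly higher than the stored version, or no existing document.
        return stored is None or given > stored
    if version_type == "external_gte":
        # Equal to or higher than the stored version, or no existing document.
        return stored is None or given >= stored
    if version_type == "internal":
        # Identical to the stored version (assumed to fail if nothing is stored).
        return stored is not None and given == stored
    raise ValueError(f"unsupported version_type: {version_type}")
```

For the earlier request with `version=2&version_type=external` against a stored version of 1, the check passes. Repeating the same request fails under `external` but would pass under `external_gte`, which is why that type needs care.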
[[docs-index-api-example]]
==== {api-examples-title}

Insert a JSON document into the `my-index-000001` index with an `_id` of 1:

[source,console]
--------------------------------------------------
PUT my-index-000001/_doc/1
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
--------------------------------------------------

The API returns the following result:

[source,console-result]
--------------------------------------------------
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}
--------------------------------------------------
// TESTRESPONSE[s/"successful": 2/"successful": 1/]

Use the `_create` resource to index a document into the `my-index-000001` index if
no document with that ID exists:

[source,console]
--------------------------------------------------
PUT my-index-000001/_create/1
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
--------------------------------------------------

Set the `op_type` parameter to _create_ to index a document into the `my-index-000001`
index if no document with that ID exists:

[source,console]
--------------------------------------------------
PUT my-index-000001/_doc/1?op_type=create
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
--------------------------------------------------
  497. }
  498. }
  499. --------------------------------------------------