| [[docs-index_]]
== Index API

The index API adds or updates a typed JSON document in a specific index,
making it searchable.
The following example inserts the JSON document
into the "twitter" index, under a type called "tweet" with an id of 1:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------

The result of the above index operation is:

[source,js]
--------------------------------------------------
{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1",
    "_version" : 1,
    "created" : true
}
--------------------------------------------------

[float]
[[index-creation]]
=== Automatic Index Creation

The index operation automatically creates an index if it has not been
created before (check out the
<<indices-create-index,create index API>> for manually
creating an index), and also automatically creates a
dynamic type mapping for the specific type if one has not yet been
created (check out the <<indices-put-mapping,put mapping>>
API for manually creating a type mapping).

The mapping itself is very flexible and is schema-free. New fields and
objects will automatically be added to the mapping definition of the
type specified.
Check out the <<mapping,mapping>>
section for more information on mapping definitions.

Note that the format of the JSON document can also include the type (very handy
when using JSON mappers) if the `index.mapping.allow_type_wrapper` setting is
set to true, for example:

[source,js]
--------------------------------------------------
$ curl -XPOST 'http://localhost:9200/twitter' -d '{
  "settings": {
    "index": {
      "mapping.allow_type_wrapper": true
    }
  }
}'
{"acknowledged":true}

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "tweet" : {
        "user" : "kimchy",
        "post_date" : "2009-11-15T14:12:12",
        "message" : "trying out Elasticsearch"
    }
}'
--------------------------------------------------

Automatic index creation can be disabled by setting
`action.auto_create_index` to `false` in the config file of all nodes.
Automatic mapping creation can be disabled by setting
`index.mapper.dynamic` to `false` in the config files of all nodes (or
on the specific index settings).

Automatic index creation can include a pattern based white/black list,
for example, set `action.auto_create_index` to `+aaa*,-bbb*,+ccc*,-*` (+
meaning allowed, and - meaning disallowed).

[float]
[[index-versioning]]
=== Versioning

Each indexed document is given a version number. The associated
`version` number is returned as part of the response to the index API
request. The index API optionally allows for
http://en.wikipedia.org/wiki/Optimistic_concurrency_control[optimistic concurrency control]
when the `version` parameter is specified. This
will control the version of the document the operation is intended to be
executed against. A good example of a use case for versioning is
performing a transactional read-then-update. Specifying a `version` from
the document initially read ensures no changes have happened in the
meantime (when reading in order to update, it is recommended to set
`preference` to `_primary`).
For example:

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/twitter/tweet/1?version=2' -d '{
    "message" : "elasticsearch now has versioning support, double cool!"
}'
--------------------------------------------------

*NOTE:* versioning is completely real time, and is not affected by the
near real time aspects of search operations. If no version is provided,
then the operation is executed without any version checks.

By default, internal versioning is used that starts at 1 and increments
with each update, deletes included. Optionally, the version number can be
supplemented with an external value (for example, if maintained in a
database). To enable this functionality, `version_type` should be set to
`external`. The value provided must be a numeric, long value greater than or equal to 0,
and less than around 9.2e+18. When using the external version type, instead
of checking for a matching version number, the system checks to see if
the version number passed to the index request is greater than the
version of the currently stored document. If true, the document will be
indexed and the new version number used. If the value provided is less
than or equal to the stored document's version number, a version
conflict will occur and the index operation will fail.

A nice side effect is that there is no need to maintain strict ordering
of async indexing operations executed as a result of changes to a source
database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from
a database is simplified if external versioning is used, as only the
latest version will be used if the index operations are out of order for
whatever reason.

[float]
==== Version types

Next to the `internal` & `external` version types explained above, Elasticsearch
also supports other types for specific use cases.
Here is an overview of
the different version types and their semantics.

`internal`:: only index the document if the given version is identical to the version
of the stored document.

`external` or `external_gt`:: only index the document if the given version is strictly higher
than the version of the stored document *or* if there is no existing document. The given
version will be used as the new version and will be stored with the new document. The supplied
version must be a non-negative long number.

`external_gte`:: only index the document if the given version is *equal* or higher
than the version of the stored document. If there is no existing document
the operation will succeed as well. The given version will be used as the new version
and will be stored with the new document. The supplied version must be a non-negative long number.

`force`:: the document will be indexed regardless of the version of the stored document or if there
is no existing document. The given version will be used as the new version and will be stored
with the new document. This version type is typically used for correcting errors.

*NOTE*: The `external_gte` & `force` version types are meant for special use cases and should be used
with care. If used incorrectly, they can result in loss of data.

[float]
[[operation-type]]
=== Operation Type

The index operation also accepts an `op_type` that can be used to force
a `create` operation, allowing for "put-if-absent" behavior.
When
`create` is used, the index operation will fail if a document with that id
already exists in the index.

Here is an example of using the `op_type` parameter:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1?op_type=create' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------

Another option to specify `create` is to use the following uri:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1/_create' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------

[float]
=== Automatic ID Generation

The index operation can be executed without specifying the id. In such a
case, an id will be generated automatically. In addition, the `op_type`
will automatically be set to `create`. Here is an example (note the
*POST* used instead of *PUT*):

[source,js]
--------------------------------------------------
$ curl -XPOST 'http://localhost:9200/twitter/tweet/' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------

The result of the above index operation is:

[source,js]
--------------------------------------------------
{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "6a8ca01c-7896-48e9-81cc-9f70661fcb32",
    "_version" : 1,
    "created" : true
}
--------------------------------------------------

[float]
[[index-routing]]
=== Routing

By default, shard placement (or `routing`) is controlled by using a
hash of the document's id value. For more explicit control, the value
fed into the hash function used by the router can be directly specified
on a per-operation basis using the `routing` parameter.
For example:

[source,js]
--------------------------------------------------
$ curl -XPOST 'http://localhost:9200/twitter/tweet?routing=kimchy' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------

In the example above, the "tweet" document is routed to a shard based on
the `routing` parameter provided: "kimchy".

When setting up explicit mapping, the `_routing` field can be optionally
used to direct the index operation to extract the routing value from the
document itself. This does come at the (very minimal) cost of an
additional document parsing pass. If the `_routing` mapping is defined,
and set to be `required`, the index operation will fail if no routing
value is provided or extracted.

[float]
[[parent-children]]
=== Parents & Children

A child document can be indexed by specifying its parent when indexing.
For example:

[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/blogs/blog_tag/1122?parent=1111' -d '{
    "tag" : "something"
}'
--------------------------------------------------

When indexing a child document, the routing value is automatically set
to be the same as its parent, unless the routing value is explicitly
specified using the `routing` parameter.

[float]
[[index-timestamp]]
=== Timestamp

A document can be indexed with a `timestamp` associated with it. The
`timestamp` value of a document can be set using the `timestamp`
parameter. For example:

[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/twitter/tweet/1?timestamp=2009-11-15T14%3A12%3A12' -d '{
    "user" : "kimchy",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------

If the `timestamp` value is not provided externally or in the `_source`,
the `timestamp` will be automatically set to the date the document was
processed by the indexing chain.
More information can be found on the
<<mapping-timestamp-field,_timestamp mapping page>>.

[float]
[[index-ttl]]
=== TTL

A document can be indexed with a `ttl` (time to live) associated with
it. Expired documents will be expunged automatically. The expiration
date that will be set for a document with a provided `ttl` is relative
to the `timestamp` of the document, meaning it can be based on the time
of indexing or on any time provided. The provided `ttl` must be strictly
positive and can be a number (in milliseconds) or any valid time value
as shown in the following examples:

[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/twitter/tweet/1?ttl=86400000' -d '{
    "user": "kimchy",
    "message": "Trying out elasticsearch, so far so good?"
}'
--------------------------------------------------

[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/twitter/tweet/1?ttl=1d' -d '{
    "user": "kimchy",
    "message": "Trying out elasticsearch, so far so good?"
}'
--------------------------------------------------

[source,js]
--------------------------------------------------
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "_ttl": "1d",
    "user": "kimchy",
    "message": "Trying out elasticsearch, so far so good?"
}'
--------------------------------------------------

More information can be found on the
<<mapping-ttl-field,_ttl mapping page>>.

[float]
[[index-distributed]]
=== Distributed

The index operation is directed to the primary shard based on its route
(see the Routing section above) and performed on the actual node
containing this shard. After the primary shard completes the operation,
if needed, the update is distributed to applicable replicas.

[float]
[[index-consistency]]
=== Write Consistency

To prevent writes from taking place on the "wrong" side of a network
partition, by default, index operations only succeed if a quorum
(>replicas/2+1) of active shards are available.
This default can be
overridden on a node-by-node basis using the `action.write_consistency`
setting. To alter this behavior per-operation, the `consistency` request
parameter can be used.

Valid write consistency values are `one`, `quorum`, and `all`.

Note that when the number of replicas is 1 (a total of 2 copies
of the data), the default behavior is to succeed if 1 copy (the primary)
can perform the write.

[float]
[[index-replication]]
=== Asynchronous Replication

By default, the index operation only returns after all shards within the
replication group have indexed the document (sync replication). To
enable asynchronous replication, causing the replication process to take
place in the background, set the `replication` parameter to `async`.
When asynchronous replication is used, the index operation will return
as soon as the operation succeeds on the primary shard.

[float]
[[index-refresh]]
=== Refresh

To refresh the shard (not the whole index) immediately after the operation
occurs, so that the document appears in search results immediately, the
`refresh` parameter can be set to `true`. Setting this option to `true` should
*ONLY* be done after careful thought and verification that it does not lead to
poor performance, both from an indexing and a search standpoint. Note that getting
a document using the get API is completely realtime.

[float]
[[timeout]]
=== Timeout

The primary shard assigned to perform the index operation might not be
available when the index operation is executed. Some reasons for this
might be that the primary shard is currently recovering from a gateway
or undergoing relocation. By default, the index operation will wait on
the primary shard to become available for up to 1 minute before failing
and responding with an error. The `timeout` parameter can be used to
explicitly specify how long it waits.
Here is an example of setting it
to 5 minutes:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1?timeout=5m' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}'
--------------------------------------------------
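
The query-string parameters described on this page (`timeout`, `routing`,
`timestamp`, `version`, and so on) can also be assembled programmatically.
The following is a minimal Python sketch, not part of Elasticsearch or of any
client library: the helper name `index_url` and its arguments are hypothetical,
the node address `http://localhost:9200` is assumed from the examples above,
and the function only builds the request URL without sending anything.

```python
from urllib.parse import quote, urlencode


def index_url(index, doc_type, doc_id=None, base="http://localhost:9200", **params):
    """Build the URL for an index request (hypothetical helper, for illustration)."""
    path = f"{base}/{index}/{doc_type}"
    if doc_id is not None:
        # Percent-encode the id so characters such as "/" cannot alter the path.
        path += "/" + quote(str(doc_id), safe="")
    # Skip parameters left as None so only explicit settings are sent.
    # urlencode() calls str() on values, so pass booleans as "true"/"false".
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path


print(index_url("twitter", "tweet", 1, timeout="5m"))
print(index_url("twitter", "tweet", 1, timestamp="2009-11-15T14:12:12"))
print(index_url("twitter", "tweet", routing="kimchy"))
```

Because `urlencode` percent-encodes reserved characters, the colons in the
timestamp come out as `%3A`, matching the hand-encoded URL in the Timestamp
example. Omitting `doc_id` yields the id-less URL used with *POST* for
automatic ID generation.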
 |