| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228 | [role="xpack"][testenv="platinum"][[ccr-overview]]=== Overview{ccr-cap} is done on an index-by-index basis. Replication isconfigured at the index level. For each configured replication there is areplication source index called the _leader index_ and a replication targetindex called the _follower index_.Replication is active-passive. This means that while the leader indexcan directly be written into, the follower index can not directly receivewrites.Replication is pull-based. This means that replication is driven by thefollower index. This simplifies state management on the leader index and meansthat {ccr} does not interfere with indexing on the leader index.In {ccr}, the cluster performing this pull is known as the _local cluster_. Thecluster being replicated is known as the _remote cluster_.==== Prerequisites* {ccr-cap} requires <<modules-remote-clusters, remote clusters>>.* The {es} version of the local cluster must be **the same as or newer** thanthe remote cluster. If newer, the versions must also be compatible as outlinedin the following matrix.include::../modules/remote-clusters.asciidoc[tag=remote-cluster-compatibility-matrix]==== Configuring replicationReplication can be configured in two ways:* Manually creating specific follower indices (in {kib} or by using the{ref}/ccr-put-follow.html[create follower API])* Automatically creating follower indices from auto-follow patterns (in {kib} orby using the {ref}/ccr-put-auto-follow-pattern.html[create auto-follow pattern API])For more information about managing {ccr} in {kib}, see{kibana-ref}/working-remote-clusters.html[Working with remote clusters].NOTE: You must also <<ccr-requirements,configure the leader index>>.When you initiate replication either manually or through an auto-follow pattern, thefollower index is created on the local cluster. Once the follower index is created,the <<remote-recovery, remote recovery>> process copies all of the Lucene segmentfiles from the remote cluster to the local cluster.By default, if you initiate following manually (by using {kib} or the create follower API),the recovery process is asynchronous in relationship to the{ref}/ccr-put-follow.html[create follower request]. The request returns beforethe <<remote-recovery, remote recovery>> process completes. If you would like to wait onthe process to complete, you can use the `wait_for_active_shards` parameter.//////////////////////////[source,console]--------------------------------------------------PUT /follower_index/_ccr/follow?wait_for_active_shards=1{  "remote_cluster" : "remote_cluster",  "leader_index" : "leader_index"}--------------------------------------------------// TESTSETUP// TEST[setup:remote_cluster_and_leader_index][source,console]--------------------------------------------------POST /follower_index/_ccr/pause_follow--------------------------------------------------// TEARDOWN//////////////////////////==== The mechanics of replicationWhile replication is managed at the index level, replication is performed at theshard level. When a follower index is created, it is automaticallyconfigured to have an identical number of shards as the leader index. A followershard task in the follower index pulls from the corresponding leader shard inthe leader index by sending read requests for new operations. These readrequests can be served from any copy of the leader shard (primary or replicas).For each read request sent by the follower shard task, if there are newoperations available on the leader shard, the leader shard responds withoperations limited by the read parameters that you established when youconfigured the follower index. If there are no new operations available on theleader shard, the leader shard waits up to a configured timeout for newoperations. If new operations occur within that timeout, the leader shardimmediately responds with those new operations. Otherwise, if the timeoutelapses, the leader shard replies that there are no new operations. Thefollower shard task updates some statistics and immediately sends another readrequest to the leader shard. This ensures that the network connections betweenthe remote cluster and the local cluster are continually being used so as toavoid forceful termination by an external source (such as a firewall).If a read request fails, the cause of the failure is inspected. If thecause of the failure is deemed to be a failure that can be recovered from (for example, a network failure), the follower shard task enters into a retryloop. Otherwise, the follower shard task is paused and requires userintervention before it can be resumed with the{ref}/ccr-post-resume-follow.html[resume follower API].When operations are received by the follower shard task, they are placed in awrite buffer. The follower shard task manages this write buffer and submitsbulk write requests from this write buffer to the follower shard.  The writebuffer and these write requests are managed by the write parameters that you established when you configured the follower index.  The write buffer serves asback-pressure against read requests. If the write buffer exceeds its configuredlimits, no additional read requests are sent by the follower shard task. Thefollower shard task resumes sending read requests when the write buffer nolonger exceeds its configured limits.NOTE: The intricacies of how operations are replicated from the leader aregoverned by settings that you can configure when you create the follower indexin {kib} or by using the {ref}/ccr-put-follow.html[create follower API].Mapping updates applied to the leader index are automatically retrievedas-needed by the follower index. It is not possible to manually modify themapping of a follower index.Settings updates applied to the leader index that are needed by the followerindex are automatically retrieved as-needed by the follower index. Not allsettings updates are needed by the follower index. For example, changing thenumber of replicas on the leader index is not replicated by the follower index.Alias updates applied to the leader index are automatically retrieved by thefollower index. It is not possible to manually modify an alias of a followerindex.NOTE: If you apply a non-dynamic settings change to the leader index that isneeded by the follower index, the follower index will go through a cycle ofclosing itself, applying the settings update, and then re-opening itself. Thefollower index will be unavailable for reads and not replicating writesduring this cycle.==== Inspecting the progress of replicationYou can inspect the progress of replication at the shard level with the{ref}/ccr-get-follow-stats.html[get follower stats API]. This API gives youinsight into the read and writes managed by the follower shard task. It alsoreports read exceptions that can be retried and fatal exceptions that requireuser intervention.==== Pausing and resuming replicationYou can pause replication with the{ref}/ccr-post-pause-follow.html[pause follower API] and then later resumereplication with the {ref}/ccr-post-resume-follow.html[resume follower API].Using these APIs in tandem enables you to adjust the read and write parameterson the follower shard task if your initial configuration is not suitable foryour use case.==== Leader index retaining operations for replicationIf the follower is unable to replicate operations from a leader for a period oftime, the following process can fail due to the leader lacking a complete historyof operations necessary for replication.Operations replicated to the follower are identified using a sequence numbergenerated when the operation was initially performed. Lucene segment files areoccasionally merged in order to optimize searches and save space. When thesemerges occur, it is possible for operations associated with deleted or updateddocuments to be pruned during the merge. When the follower requests the sequencenumber for a pruned operation, the process will fail due to the operation missingon the leader.This scenario is not possible in an append-only workflow. As documents are neverdeleted or updated, the underlying operation will not be pruned.Elasticsearch attempts to mitigate this potential issue for update workflows usinga Lucene feature called soft deletes. When a document is updated or deleted, theunderlying operation is retained in the Lucene index for a period of time. Thisperiod of time is governed by the `index.soft_deletes.retention_lease.period`setting which can be <<ccr-requirements,configured on the leader index>>.When a follower initiates the index following, it acquires a retention lease fromthe leader. This informs the leader that it should not allow a soft delete to bepruned until either the follower indicates that it has received the operation orthe lease expires. It is valuable to have monitoring in place to detect a followerreplication issue prior to the lease expiring so that the problem can be remediedbefore the follower falls fatally behind.==== Remedying a follower that has fallen behindIf a follower falls sufficiently behind a leader that it can no longer replicateoperations this can be detected in {kib} or by using the{ref}/ccr-get-follow-stats.html[get follow stats API]. It will be reported as a`indices[].fatal_exception`.In order to restart the follower, you must pause the following process, close theindex, and the create follower index again. For example:[source,console]----------------------------------------------------------------------POST /follower_index/_ccr/pause_followPOST /follower_index/_closePUT /follower_index/_ccr/follow?wait_for_active_shards=1{  "remote_cluster" : "remote_cluster",  "leader_index" : "leader_index"}----------------------------------------------------------------------Re-creating the follower index is a destructive action. All of the existing Lucenesegment files are deleted on the follower cluster. The<<remote-recovery, remote recovery>> process copies the Lucene segmentfiles from the leader again. After the follower index initializes, thefollowing process starts again.==== Terminating replicationYou can terminate replication with the{ref}/ccr-post-unfollow.html[unfollow API]. This API converts a follower indexto a regular (non-follower) index.
 |