
[role="xpack"]
[[ccr-disaster-recovery-bi-directional-tutorial]]
=== Tutorial: Disaster recovery based on bi-directional {ccr}
++++
<titleabbrev>Bi-directional disaster recovery</titleabbrev>
++++

////
[source,console]
----
PUT _data_stream/logs-generic-default
----
// TESTSETUP

[source,console]
----
DELETE /_data_stream/*
----
// TEARDOWN
////
Learn how to set up disaster recovery between two clusters based on
bi-directional {ccr}. The following tutorial is designed for data streams which support
<<update-docs-in-a-data-stream-by-query,update by query>> and <<delete-docs-in-a-data-stream-by-query,delete by query>>. You can only perform these actions on the leader index.

This tutorial works with {ls} as the source of ingestion. It takes advantage of a {ls}
feature where {logstash-ref}/plugins-outputs-elasticsearch.html[the {ls} output to {es}]
can be load balanced across an array of specified hosts. {beats} and {agents} currently do not
support multiple outputs. It should also be possible to set up a proxy
(load balancer) to redirect traffic without {ls} in this tutorial.

The tutorial covers the following steps:

* Setting up a remote cluster on `clusterA` and `clusterB`.
* Setting up bi-directional cross-cluster replication with exclusion patterns.
* Setting up {ls} with multiple hosts to allow automatic load balancing and switching during disasters.

image::images/ccr-bi-directional-disaster-recovery.png[Bi-directional cross cluster replication failover and failback]
[[ccr-tutorial-initial-setup]]
==== Initial setup

. Set up a remote cluster on both clusters.
+
[source,console]
----
### On cluster A ###
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "clusterB": {
          "mode": "proxy",
          "skip_unavailable": true,
          "server_name": "clusterb.es.region-b.gcp.elastic-cloud.com",
          "proxy_socket_connections": 18,
          "proxy_address": "clusterb.es.region-b.gcp.elastic-cloud.com:9400"
        }
      }
    }
  }
}
### On cluster B ###
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "clusterA": {
          "mode": "proxy",
          "skip_unavailable": true,
          "server_name": "clustera.es.region-a.gcp.elastic-cloud.com",
          "proxy_socket_connections": 18,
          "proxy_address": "clustera.es.region-a.gcp.elastic-cloud.com:9400"
        }
      }
    }
  }
}
----
// TEST[setup:host]
// TEST[s/"server_name": "clustera.es.region-a.gcp.elastic-cloud.com",//]
// TEST[s/"server_name": "clusterb.es.region-b.gcp.elastic-cloud.com",//]
// TEST[s/"proxy_socket_connections": 18,//]
// TEST[s/clustera.es.region-a.gcp.elastic-cloud.com:9400/\${transport_host}/]
// TEST[s/clusterb.es.region-b.gcp.elastic-cloud.com:9400/\${transport_host}/]
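+
To confirm that each cluster can reach its remote, you can check the connection status with the remote cluster info API. A healthy setup reports `"connected": true` for the configured remote:
+
[source,console]
----
### On either cluster ###
GET /_remote/info
----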
. Set up bi-directional cross-cluster replication.
+
[source,console]
----
### On cluster A ###
PUT /_ccr/auto_follow/logs-generic-default
{
  "remote_cluster": "clusterB",
  "leader_index_patterns": [
    ".ds-logs-generic-default-20*"
  ],
  "leader_index_exclusion_patterns": "*-replicated_from_clustera",
  "follow_index_pattern": "{{leader_index}}-replicated_from_clusterb"
}
### On cluster B ###
PUT /_ccr/auto_follow/logs-generic-default
{
  "remote_cluster": "clusterA",
  "leader_index_patterns": [
    ".ds-logs-generic-default-20*"
  ],
  "leader_index_exclusion_patterns": "*-replicated_from_clusterb",
  "follow_index_pattern": "{{leader_index}}-replicated_from_clustera"
}
----
// TEST[setup:remote_cluster]
// TEST[s/clusterA/remote_cluster/]
// TEST[s/clusterB/remote_cluster/]
+
IMPORTANT: Existing data on the cluster will not be replicated by
`_ccr/auto_follow` even though the patterns may match. This function will only
replicate newly created backing indices (as part of the data stream).
+
IMPORTANT: Use `leader_index_exclusion_patterns` to avoid recursion.
+
TIP: `follow_index_pattern` allows lowercase characters only.
+
TIP: This step cannot be executed via the {kib} UI due to the lack of an exclusion
pattern in the UI. Use the API in this step.
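+
To review the auto-follow pattern that is now in place on a cluster, you can retrieve it with the get auto-follow pattern API:
+
[source,console]
----
### On either cluster ###
GET /_ccr/auto_follow/logs-generic-default
----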
. Set up the {ls} configuration file.
+
This example uses the input generator to demonstrate the document
count in the clusters. Reconfigure this section
to suit your own use case.
+
[source,logstash]
----
### On Logstash server ###
### This is a logstash config file ###
input {
  generator {
    message => 'Hello World'
    count => 100
  }
}
output {
  elasticsearch {
    hosts => ["https://clustera.es.region-a.gcp.elastic-cloud.com:9243","https://clusterb.es.region-b.gcp.elastic-cloud.com:9243"]
    user => "logstash-user"
    password => "same_password_for_both_clusters"
  }
}
----
+
IMPORTANT: The key point is that when `cluster A` is down, all traffic will be
automatically redirected to `cluster B`. Once `cluster A` comes back, traffic
is automatically redirected back to `cluster A` again. This is achieved by the
`hosts` option, where multiple {es} cluster endpoints are specified in the
array `[clusterA, clusterB]`.
+
TIP: Set up the same password for the same user on both clusters to use this load-balancing feature.
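+
As a minimal sketch of that user setup (the `logstash_writer` role name is a placeholder; assign whatever roles grant the index privileges your pipeline needs), the same user with the same password could be created on both clusters:
+
[source,console]
----
### On both cluster A and cluster B ###
POST /_security/user/logstash-user
{
  "password": "same_password_for_both_clusters",
  "roles": [ "logstash_writer" ]
}
----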
. Start {ls} with the earlier configuration file.
+
[source,sh]
----
### On Logstash server ###
bin/logstash -f multiple_hosts.conf
----
. Observe document counts in data streams.
+
The setup creates a data stream named `logs-generic-default` on each of the clusters. {ls} will write 50% of the documents to `cluster A` and 50% of the documents to `cluster B` when both clusters are up.
+
Bi-directional {ccr} will create one more data stream on each of the clusters
with the `-replicated_from_cluster{a|b}` suffix. At the end of this step:
+
* data streams on cluster A contain:
** 50 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* data streams on cluster B contain:
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 50 documents in `logs-generic-default`
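+
One way to observe these counts is with the count API against each data stream. Once {ls} has finished and replication has caught up, the totals should match the numbers above:
+
[source,console]
----
### On cluster A ###
GET logs-generic-default/_count
GET logs-generic-default-replicated_from_clusterb/_count
----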
. Queries should be set up to search across both data streams.
A query on `logs*`, on either of the clusters, returns 100
hits in total.
+
[source,console]
----
GET logs*/_search?size=0
----
==== Failover when `clusterA` is down

. You can simulate this by shutting down either of the clusters. Let's shut down
`cluster A` in this tutorial.
. Start {ls} with the same configuration file. (This step is not required in real
use cases where {ls} ingests continuously.)
+
[source,sh]
----
### On Logstash server ###
bin/logstash -f multiple_hosts.conf
----
. Observe that all {ls} traffic is redirected to `cluster B` automatically.
+
TIP: You should also redirect all search traffic to the `clusterB` cluster during this time.
. The two data streams on `cluster B` now contain a different number of documents.
+
* data streams on cluster A (down)
** 50 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* data streams on cluster B (up)
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 150 documents in `logs-generic-default`
==== Failback when `clusterA` comes back

. You can simulate this by turning `cluster A` back on.
. Data ingested to `cluster B` during `cluster A`'s downtime will be
automatically replicated.
+
* data streams on cluster A
** 150 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* data streams on cluster B
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 150 documents in `logs-generic-default`
. If you have {ls} running at this time, you will also observe traffic is
sent to both clusters.
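+
To confirm that replication has caught up after the failback, you can inspect follower progress with the {ccr} stats API:
+
[source,console]
----
### On either cluster ###
GET /_ccr/stats
----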
==== Perform update or delete by query

It is possible to update or delete the documents, but you can only perform these actions on the leader index.

. First identify which backing index contains the document you want to update.
+
[source,console]
----
### On either of the clusters ###
GET logs-generic-default*/_search?filter_path=hits.hits._index
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  }
}
----
+
* If the hits return `"_index": ".ds-logs-generic-default-replicated_from_clustera-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on `cluster A`.
* If the hits return `"_index": ".ds-logs-generic-default-replicated_from_clusterb-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on `cluster B`.
* If the hits return `"_index": ".ds-logs-generic-default-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on the same cluster where you performed the search query.
. Perform the update (or delete) by query:
+
[source,console]
----
### On the cluster identified from the previous step ###
POST logs-generic-default/_update_by_query
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  },
  "script": {
    "source": "ctx._source.event.original = params.new_event",
    "lang": "painless",
    "params": {
      "new_event": "FOOBAR"
    }
  }
}
----
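+
For a delete instead of an update, the same match query can be used with the delete by query API, again on the cluster that owns the leader index:
+
[source,console]
----
### On the cluster identified from the previous step ###
POST logs-generic-default/_delete_by_query
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  }
}
----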
+
TIP: If a soft delete is merged away before it can be replicated to a follower, this process will fail due to incomplete history on the leader. See <<ccr-index-soft-deletes-retention-period,index.soft_deletes.retention_lease.period>> for more details.