@@ -0,0 +1,275 @@
+[role="xpack"]
+[[ccr-disaster-recovery-bi-directional-tutorial]]
+=== Tutorial: Disaster recovery based on bi-directional {ccr}
+++++
+<titleabbrev>Bi-directional disaster recovery</titleabbrev>
+++++
+
+////
+[source,console]
+----
+PUT _data_stream/logs-generic-default
+----
+// TESTSETUP
+
+[source,console]
+----
+DELETE /_data_stream/*
+----
+// TEARDOWN
+////
+
+Learn how to set up disaster recovery between two clusters based on
+bi-directional {ccr}. The following tutorial is designed for data streams that support
+<<update-docs-in-a-data-stream-by-query,update by query>> and <<delete-docs-in-a-data-stream-by-query,delete by query>>. You can only perform these actions on the leader index.
+
+This tutorial works with {ls} as the source of ingestion. It takes advantage of a {ls} feature where {logstash-ref}/plugins-outputs-elasticsearch.html[the {ls} output to {es}] can be load balanced across an array of specified hosts. {beats} and {agents} currently do not
+support multiple outputs. It should also be possible to set up a proxy
+(load balancer) in place of {ls} to redirect traffic.
+
+This tutorial covers:
+
+* Setting up a remote cluster on `clusterA` and `clusterB`.
+* Setting up bi-directional cross-cluster replication with exclusion patterns.
+* Setting up {ls} with multiple hosts to allow automatic load balancing and switching during disasters.
+
+image::images/ccr-bi-directional-disaster-recovery.png[Bi-directional cross cluster replication failover and failback]
+
+[[ccr-tutorial-initial-setup]]
+==== Initial setup
+. Set up a remote cluster on both clusters.
++
+[source,console]
+----
+### On cluster A ###
+PUT _cluster/settings
+{
+  "persistent": {
+    "cluster": {
+      "remote": {
+        "clusterB": {
+          "mode": "proxy",
+          "skip_unavailable": true,
+          "server_name": "clusterb.es.region-b.gcp.elastic-cloud.com",
+          "proxy_socket_connections": 18,
+          "proxy_address": "clusterb.es.region-b.gcp.elastic-cloud.com:9400"
+        }
+      }
+    }
+  }
+}
+### On cluster B ###
+PUT _cluster/settings
+{
+  "persistent": {
+    "cluster": {
+      "remote": {
+        "clusterA": {
+          "mode": "proxy",
+          "skip_unavailable": true,
+          "server_name": "clustera.es.region-a.gcp.elastic-cloud.com",
+          "proxy_socket_connections": 18,
+          "proxy_address": "clustera.es.region-a.gcp.elastic-cloud.com:9400"
+        }
+      }
+    }
+  }
+}
+----
+// TEST[setup:host]
+// TEST[s/"server_name": "clustera.es.region-a.gcp.elastic-cloud.com",//]
+// TEST[s/"server_name": "clusterb.es.region-b.gcp.elastic-cloud.com",//]
+// TEST[s/"proxy_socket_connections": 18,//]
+// TEST[s/clustera.es.region-a.gcp.elastic-cloud.com:9400/\${transport_host}/]
+// TEST[s/clusterb.es.region-b.gcp.elastic-cloud.com:9400/\${transport_host}/]
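++
+Optionally, you can confirm that each cluster can reach its remote before
+continuing. For example, on `cluster A` the response should list `clusterB` and
+report whether the connection is established:
++
+[source,console]
+----
+GET /_remote/info
+----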
+
+. Set up bi-directional cross-cluster replication.
++
+[source,console]
+----
+### On cluster A ###
+PUT /_ccr/auto_follow/logs-generic-default
+{
+  "remote_cluster": "clusterB",
+  "leader_index_patterns": [
+    ".ds-logs-generic-default-20*"
+  ],
+  "leader_index_exclusion_patterns": "{{leader_index}}-replicated_from_clustera",
+  "follow_index_pattern": "{{leader_index}}-replicated_from_clusterb"
+}
+
+### On cluster B ###
+PUT /_ccr/auto_follow/logs-generic-default
+{
+  "remote_cluster": "clusterA",
+  "leader_index_patterns": [
+    ".ds-logs-generic-default-20*"
+  ],
+  "leader_index_exclusion_patterns": "{{leader_index}}-replicated_from_clusterb",
+  "follow_index_pattern": "{{leader_index}}-replicated_from_clustera"
+}
+----
+// TEST[setup:remote_cluster]
+// TEST[s/clusterA/remote_cluster/]
+// TEST[s/clusterB/remote_cluster/]
++
+IMPORTANT: Existing data on the cluster will not be replicated by
+`_ccr/auto_follow` even though the patterns may match. This function only
+replicates newly created backing indices (as part of the data stream).
++
+IMPORTANT: Use `leader_index_exclusion_patterns` to avoid recursion.
++
+TIP: `follow_index_pattern` allows lowercase characters only.
++
+TIP: This step cannot be executed via the {kib} UI because the UI lacks an
+exclusion pattern field. Use the API for this step.
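++
+To verify that the auto-follow pattern was created, you can retrieve it by name
+on each cluster, for example:
++
+[source,console]
+----
+GET /_ccr/auto_follow/logs-generic-default
+----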
+
+. Set up the {ls} configuration file.
++
+This example uses the `generator` input plugin to demonstrate the document
+count in the clusters. Reconfigure this section
+to suit your own use case.
++
+[source,logstash]
+----
+### On Logstash server ###
+### This is a logstash config file ###
+input {
+  generator {
+    message => 'Hello World'
+    count => 100
+  }
+}
+output {
+  elasticsearch {
+    hosts => ["https://clustera.es.region-a.gcp.elastic-cloud.com:9243","https://clusterb.es.region-b.gcp.elastic-cloud.com:9243"]
+    user => "logstash-user"
+    password => "same_password_for_both_clusters"
+  }
+}
+----
++
+IMPORTANT: The key point is that when `cluster A` is down, all traffic is
+automatically redirected to `cluster B`. Once `cluster A` comes back, traffic
+is automatically redirected back to `cluster A` again. This is achieved with the
+`hosts` option, where multiple {es} cluster endpoints are specified in the
+array `[clusterA, clusterB]`.
++
+TIP: Set up the same password for the same user on both clusters to use this load-balancing feature.
+
+. Start {ls} with the earlier configuration file.
++
+[source,sh]
+----
+### On Logstash server ###
+bin/logstash -f multiple_hosts.conf
+----
+
+. Observe document counts in data streams.
++
+The setup creates a data stream named `logs-generic-default` on each of the clusters. {ls} will write 50% of the documents to `cluster A` and 50% of the documents to `cluster B` when both clusters are up.
++
+Bi-directional {ccr} will create one more data stream on each of the clusters
+with the `-replicated_from_cluster{a|b}` suffix. At the end of this step:
++
+* data streams on cluster A contain:
+** 50 documents in `logs-generic-default-replicated_from_clusterb`
+** 50 documents in `logs-generic-default`
+* data streams on cluster B contain:
+** 50 documents in `logs-generic-default-replicated_from_clustera`
+** 50 documents in `logs-generic-default`
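++
+To confirm that both the local and the replicated data streams exist on a
+cluster, you can list them with the get data stream API, for example:
++
+[source,console]
+----
+GET _data_stream/logs*
+----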
+
+. Queries should be set up to search across both data streams.
+A query on `logs*` on either cluster returns 100
+hits in total.
++
+[source,console]
+----
+GET logs*/_search?size=0
+----
+
+
+==== Failover when `clusterA` is down
+. You can simulate this by shutting down either of the clusters. Let's shut down
+`cluster A` in this tutorial.
+. Start {ls} with the same configuration file. (This step is not required in real
+use cases where {ls} ingests continuously.)
++
+[source,sh]
+----
+### On Logstash server ###
+bin/logstash -f multiple_hosts.conf
+----
+
+. Observe that all {ls} traffic is redirected to `cluster B` automatically.
++
+TIP: You should also redirect all search traffic to `cluster B` during this time.
+
+. The two data streams on `cluster B` now contain a different number of documents.
++
+* data streams on cluster A (down)
+** 50 documents in `logs-generic-default-replicated_from_clusterb`
+** 50 documents in `logs-generic-default`
+* data streams on cluster B (up)
+** 50 documents in `logs-generic-default-replicated_from_clustera`
+** 150 documents in `logs-generic-default`
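++
+You can verify these numbers from `cluster B`. Assuming the document counts
+listed above, a count across both data streams should return 200 in total:
++
+[source,console]
+----
+### On cluster B ###
+GET logs*/_count
+----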
+
+
+==== Failback when `clusterA` comes back
+. You can simulate this by turning `cluster A` back on.
+. Data ingested into `cluster B` during `cluster A`'s downtime is
+automatically replicated to `cluster A`.
++
+* data streams on cluster A
+** 150 documents in `logs-generic-default-replicated_from_clusterb`
+** 50 documents in `logs-generic-default`
+* data streams on cluster B
+** 50 documents in `logs-generic-default-replicated_from_clustera`
+** 150 documents in `logs-generic-default`
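++
+To confirm that replication to `cluster A` has caught up, you can check the
+{ccr} statistics on `cluster A`, which include per-shard follower statistics,
+for example:
++
+[source,console]
+----
+### On cluster A ###
+GET /_ccr/stats
+----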
+
+. If you have {ls} running at this time, you will also observe that traffic is
+sent to both clusters again.
+
+==== Perform update or delete by query
+It is possible to update or delete documents, but you can only perform these actions on the leader index.
+
+. First, identify which backing index contains the document you want to update.
++
+[source,console]
+----
+### On either cluster ###
+GET logs-generic-default*/_search?filter_path=hits.hits._index
+{
+  "query": {
+    "match": {
+      "event.sequence": "97"
+    }
+  }
+}
+----
++
+* If the hit returns `"_index": ".ds-logs-generic-default-replicated_from_clustera-<yyyy.MM.dd>-*"`, proceed to the next step on `cluster A`.
+* If the hit returns `"_index": ".ds-logs-generic-default-replicated_from_clusterb-<yyyy.MM.dd>-*"`, proceed to the next step on `cluster B`.
+* If the hit returns `"_index": ".ds-logs-generic-default-<yyyy.MM.dd>-*"`, proceed to the next step on the same cluster where you performed the search query.
+
+. Perform the update (or delete) by query:
++
+[source,console]
+----
+### On the cluster identified from the previous step ###
+POST logs-generic-default/_update_by_query
+{
+  "query": {
+    "match": {
+      "event.sequence": "97"
+    }
+  },
+  "script": {
+    "source": "ctx._source.event.original = params.new_event",
+    "lang": "painless",
+    "params": {
+      "new_event": "FOOBAR"
+    }
+  }
+}
+----
++
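+A delete by query follows the same pattern. As a minimal sketch, this removes
+the document matched above; run it on the cluster identified in the previous step:
++
+[source,console]
+----
+### On the cluster identified from the previous step ###
+POST logs-generic-default/_delete_by_query
+{
+  "query": {
+    "match": {
+      "event.sequence": "97"
+    }
+  }
+}
+----
++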
+TIP: If a soft delete is merged away before it can be replicated to a follower,
+this process will fail due to incomplete history on the leader. See
+<<ccr-index-soft-deletes-retention-period,index.soft_deletes.retention_lease.period>> for more details.