[role="xpack"]
[[ccr-disaster-recovery-bi-directional-tutorial]]
=== Tutorial: Disaster recovery based on bi-directional {ccr}
++++
<titleabbrev>Bi-directional disaster recovery</titleabbrev>
++++

////
[source,console]
----
PUT _data_stream/logs-generic-default
----
// TESTSETUP

[source,console]
----
DELETE /_data_stream/*
----
// TEARDOWN
////
Learn how to set up disaster recovery between two clusters based on
bi-directional {ccr}. The following tutorial is designed for data streams which support
<<update-docs-in-a-data-stream-by-query,update by query>> and
<<delete-docs-in-a-data-stream-by-query,delete by query>>. You can only perform these
actions on the leader index.

This tutorial works with {ls} as the source of ingestion. It takes advantage of a {ls}
feature where {logstash-ref}/plugins-outputs-elasticsearch.html[the {ls} output to {es}]
can be load balanced across an array of specified hosts. {beats} and {agents} currently do not
support multiple outputs. It should also be possible to achieve the same setup with a proxy
(load balancer) redirecting traffic instead of {ls}.

This tutorial covers:

* Setting up a remote cluster on `clusterA` and `clusterB`.
* Setting up bi-directional cross-cluster replication with exclusion patterns.
* Setting up {ls} with multiple hosts to allow automatic load balancing and switching during disasters.

image::images/ccr-bi-directional-disaster-recovery.png[Bi-directional cross cluster replication failover and failback]

[[ccr-tutorial-initial-setup]]
==== Initial setup
. Set up a remote cluster on both clusters.
+
[source,console]
----
### On cluster A ###
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "clusterB": {
          "mode": "proxy",
          "skip_unavailable": true,
          "server_name": "clusterb.es.region-b.gcp.elastic-cloud.com",
          "proxy_socket_connections": 18,
          "proxy_address": "clusterb.es.region-b.gcp.elastic-cloud.com:9400"
        }
      }
    }
  }
}

### On cluster B ###
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "clusterA": {
          "mode": "proxy",
          "skip_unavailable": true,
          "server_name": "clustera.es.region-a.gcp.elastic-cloud.com",
          "proxy_socket_connections": 18,
          "proxy_address": "clustera.es.region-a.gcp.elastic-cloud.com:9400"
        }
      }
    }
  }
}
----
// TEST[setup:host]
// TEST[s/"server_name": "clustera.es.region-a.gcp.elastic-cloud.com",//]
// TEST[s/"server_name": "clusterb.es.region-b.gcp.elastic-cloud.com",//]
// TEST[s/"proxy_socket_connections": 18,//]
// TEST[s/clustera.es.region-a.gcp.elastic-cloud.com:9400/\${transport_host}/]
// TEST[s/clusterb.es.region-b.gcp.elastic-cloud.com:9400/\${transport_host}/]
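+
Before continuing, you can confirm that each cluster can reach its remote with the remote
cluster info API. The exact response fields vary by version, but the remote you just
configured should report `"connected": true` once a connection has been established:
+
[source,console]
----
### On either cluster ###
GET /_remote/info
----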
. Set up bi-directional cross-cluster replication.
+
[source,console]
----
### On cluster A ###
PUT /_ccr/auto_follow/logs-generic-default
{
  "remote_cluster": "clusterB",
  "leader_index_patterns": [
    ".ds-logs-generic-default-20*"
  ],
  "leader_index_exclusion_patterns": "*-replicated_from_clustera",
  "follow_index_pattern": "{{leader_index}}-replicated_from_clusterb"
}

### On cluster B ###
PUT /_ccr/auto_follow/logs-generic-default
{
  "remote_cluster": "clusterA",
  "leader_index_patterns": [
    ".ds-logs-generic-default-20*"
  ],
  "leader_index_exclusion_patterns": "*-replicated_from_clusterb",
  "follow_index_pattern": "{{leader_index}}-replicated_from_clustera"
}
----
// TEST[setup:remote_cluster]
// TEST[s/clusterA/remote_cluster/]
// TEST[s/clusterB/remote_cluster/]
+
IMPORTANT: Existing data on the cluster will not be replicated by
`_ccr/auto_follow` even though the patterns may match. This function will only
replicate newly created backing indices (as part of the data stream).
+
IMPORTANT: Use `leader_index_exclusion_patterns` to avoid recursion.
+
TIP: `follow_index_pattern` allows lowercase characters only.
+
TIP: This step cannot be executed via the {kib} UI because the UI does not expose an
exclusion pattern field. Use the API for this step.
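+
You can verify that the pattern was stored as intended, including the exclusion pattern,
by retrieving it by name:
+
[source,console]
----
### On either cluster ###
GET /_ccr/auto_follow/logs-generic-default
----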
. Set up the {ls} configuration file.
+
This example uses the `generator` input plugin to demonstrate the document
count in the clusters. Reconfigure this section
to suit your own use case.
+
[source,logstash]
----
### On Logstash server ###
### This is a logstash config file ###
input {
  generator {
    message => 'Hello World'
    count => 100
  }
}

output {
  elasticsearch {
    hosts => ["https://clustera.es.region-a.gcp.elastic-cloud.com:9243","https://clusterb.es.region-b.gcp.elastic-cloud.com:9243"]
    user => "logstash-user"
    password => "same_password_for_both_clusters"
  }
}
----
+
IMPORTANT: The key point is that when `cluster A` is down, all traffic will be
automatically redirected to `cluster B`. Once `cluster A` comes back, traffic
is automatically redirected back to `cluster A` again. This is achieved with the
`hosts` option, where multiple {es} cluster endpoints are specified in the
array `[clusterA, clusterB]`.
+
TIP: Set up the same password for the same user on both clusters to use this load-balancing feature.
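+
If that user does not exist yet, a minimal sketch of creating it with identical credentials
on both clusters is shown below. The `logstash_writer` role name is an assumption for this
example; use whichever role grants write access to `logs-generic-default` in your deployment.
+
[source,console]
----
### Hypothetical example - run on cluster A and on cluster B, adjusting the role to your environment ###
POST /_security/user/logstash-user
{
  "password": "same_password_for_both_clusters",
  "roles": [ "logstash_writer" ]
}
----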
. Start {ls} with the earlier configuration file.
+
[source,sh]
----
### On Logstash server ###
bin/logstash -f multiple_hosts.conf
----
. Observe document counts in data streams.
+
The setup creates a data stream named `logs-generic-default` on each of the clusters. {ls} will write 50% of the documents to `cluster A` and 50% of the documents to `cluster B` when both clusters are up.
+
Bi-directional {ccr} will create one more data stream on each of the clusters
with the `-replicated_from_cluster{a|b}` suffix. At the end of this step:
+
* data streams on cluster A contain:
** 50 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* data streams on cluster B contain:
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 50 documents in `logs-generic-default`
. Queries should be set up to search across both data streams.
A query on `logs*`, on either of the clusters, returns 100
hits in total.
+
[source,console]
----
GET logs*/_search?size=0
----
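+
You can also check each data stream individually. On `cluster A`, for example, both of the
following should report a count of 50 at this point (the names assume the auto-follow
patterns shown earlier):
+
[source,console]
----
### On cluster A ###
GET logs-generic-default/_count

GET logs-generic-default-replicated_from_clusterb/_count
----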

==== Failover when `clusterA` is down
. You can simulate this by shutting down either of the clusters. Let's shut down
`cluster A` in this tutorial.
. Start {ls} with the same configuration file. (This step is not required in real
use cases where {ls} ingests continuously.)
+
[source,sh]
----
### On Logstash server ###
bin/logstash -f multiple_hosts.conf
----
. Observe that all {ls} traffic is redirected to `cluster B` automatically.
+
TIP: You should also redirect all search traffic to the `clusterB` cluster during this time.
. The two data streams on `cluster B` now contain a different number of documents,
as the count check below confirms.
+
* data streams on cluster A (down)
** 50 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* data streams on cluster B (up)
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 150 documents in `logs-generic-default`
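
With `cluster A` still down, a count across `logs*` on `cluster B` should therefore return
200 documents (50 replicated from `cluster A` earlier plus 150 ingested locally), assuming
both generator runs completed with `count => 100`:

[source,console]
----
### On cluster B ###
GET logs*/_count
----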

==== Failback when `clusterA` comes back
. You can simulate this by turning `cluster A` back on.
. Data ingested to `cluster B` during `cluster A`'s downtime will be
automatically replicated.
+
* data streams on cluster A
** 150 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* data streams on cluster B
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 150 documents in `logs-generic-default`
. If you have {ls} running at this time, you will also observe that traffic is
sent to both clusters.
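
To confirm that replication has caught up after a failback, one option is to look at the
overall {ccr} statistics on either cluster. Within `follow_stats`, each follower shard
reports a `leader_global_checkpoint` and a `follower_global_checkpoint`; the two values
should converge once the clusters are back in sync.

[source,console]
----
### On either cluster ###
GET /_ccr/stats
----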

==== Perform update or delete by query
It is possible to update or delete documents, but you can only perform these actions on the leader index.
. First identify which backing index contains the document you want to update.
+
[source,console]
----
### On either of the clusters ###
GET logs-generic-default*/_search?filter_path=hits.hits._index
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  }
}
----
+
* If the search returns `"_index": ".ds-logs-generic-default-replicated_from_clustera-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on `cluster A`.
* If the search returns `"_index": ".ds-logs-generic-default-replicated_from_clusterb-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on `cluster B`.
* If the search returns `"_index": ".ds-logs-generic-default-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on the same cluster where you performed the search query.
. Perform the update (or delete) by query:
+
[source,console]
----
### On the cluster identified from the previous step ###
POST logs-generic-default/_update_by_query
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  },
  "script": {
    "source": "ctx._source.event.original = params.new_event",
    "lang": "painless",
    "params": {
      "new_event": "FOOBAR"
    }
  }
}
----
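+
A delete by query follows the same pattern. For example, to remove the same document instead
of updating it, you could run the following on the cluster identified in the previous step:
+
[source,console]
----
### On the cluster identified from the previous step ###
POST logs-generic-default/_delete_by_query
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  }
}
----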
+
TIP: If a soft delete is merged away before it can be replicated to a follower, this process will fail due to incomplete history on the leader. See <<ccr-index-soft-deletes-retention-period, index.soft_deletes.retention_lease.period>> for more details.