[[repository-gcs]]
=== Google Cloud Storage Repository Plugin

The GCS repository plugin adds support for using the https://cloud.google.com/storage/[Google Cloud Storage]
service as a repository for {ref}/modules-snapshots.html[Snapshot/Restore].
[[repository-gcs-install]]
[float]
==== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install repository-gcs
----------------------------------------------------------------

NOTE: The plugin requires additional permissions to be granted at install time in order to work.

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.

This plugin can be downloaded for <<plugin-management-custom-url,offline install>> from
{plugin_url}/repository-gcs/repository-gcs-{version}.zip.
[[repository-gcs-remove]]
[float]
==== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin remove repository-gcs
----------------------------------------------------------------

The node must be stopped before removing the plugin.
[[repository-gcs-usage]]
==== Getting started

The plugin uses the https://cloud.google.com/storage/docs/json_api/[Google Cloud Storage JSON API] (v1)
to connect to the Storage service. If this is the first time you are using Google Cloud Storage, you first
need to connect to the https://console.cloud.google.com/[Google Cloud Platform Console] and create a new
project. Once your project is created, you must enable the Cloud Storage Service for your project.
[[repository-gcs-creating-bucket]]
===== Creating a Bucket

The Google Cloud Storage service uses the concept of a https://cloud.google.com/storage/docs/key-terms[Bucket]
as a container for all the data. Buckets are usually created using the
https://console.cloud.google.com/[Google Cloud Platform Console]. The plugin does not automatically
create buckets.

To create a new bucket:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
2. Select your project
3. Go to the https://console.cloud.google.com/storage/browser[Storage Browser]
4. Click the "Create Bucket" button
5. Enter the name of the new bucket
6. Select a storage class
7. Select a location
8. Click the "Create" button

The bucket should now be created.
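Bucket names must follow Google Cloud Storage's naming rules. The check below is an illustrative Python sketch of the common rules (3 to 63 characters; lowercase letters, digits, dashes, underscores and dots; starting and ending with a letter or digit). It is not part of the plugin, and it ignores the special case that names containing dots may be longer:

```python
import re

# Sketch of the common Google Cloud Storage bucket naming rules:
# 3-63 characters, lowercase letters, digits, dashes, underscores and
# dots, starting and ending with a letter or digit.
BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes this simplified naming check."""
    return BUCKET_NAME.fullmatch(name) is not None
```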
[[repository-gcs-service-authentication]]
===== Service Authentication

The plugin supports two authentication modes:

* the built-in <<repository-gcs-using-compute-engine, Compute Engine authentication>>. This mode is
recommended if your Elasticsearch node is running on a Compute Engine virtual machine.

* the <<repository-gcs-using-service-account, Service Account>> authentication mode.
[[repository-gcs-using-compute-engine]]
===== Using Compute Engine

When running on Compute Engine, the plugin uses Google's built-in authentication mechanism to
authenticate to the Storage service. Compute Engine virtual machines are usually associated with a
default service account. This service account can be found in the VM instance details in the
https://console.cloud.google.com/compute/[Compute Engine console].

To indicate that a repository should use the built-in authentication,
the repository `service_account` setting must be set to `_default_`:

[source,js]
----
PUT _snapshot/my_gcs_repository_on_compute_engine
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "service_account": "_default_"
  }
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]

NOTE: The Compute Engine VM must be allowed to use the Storage service. This can be done only at VM
creation time, when "Storage" access can be configured with "Read/Write" permission. Check your
instance details in the "Cloud API access scopes" section.
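The scopes granted to the VM's default service account can also be read from the Compute Engine metadata server; Storage access appears as a scope such as `https://www.googleapis.com/auth/devstorage.read_write`. The Python sketch below (not part of the plugin) only builds the request, since the metadata server is reachable from inside a Compute Engine VM only:

```python
import urllib.request

def scopes_request() -> urllib.request.Request:
    """Build a request for the scopes of the VM's default service account.

    Sending it (urllib.request.urlopen) only works on a Compute Engine VM.
    """
    url = ("http://metadata.google.internal/computeMetadata/v1/"
           "instance/service-accounts/default/scopes")
    # The metadata server rejects requests that lack this header.
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
```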
[[repository-gcs-using-service-account]]
===== Using a Service Account

If your Elasticsearch node is not running on Compute Engine, or if you don't want to use Google's
built-in authentication mechanism, you can authenticate to the Storage service using a
https://cloud.google.com/iam/docs/overview#service_account[Service Account] file.

To create a service account file:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
2. Select your project
3. Go to the https://console.cloud.google.com/permissions[Permission] tab
4. Select the https://console.cloud.google.com/permissions/serviceaccounts[Service Accounts] tab
5. Click "Create service account"
6. Once created, select the new service account and download a JSON key file
A service account file looks like this:

[source,js]
----
{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "service-account-for-your-repository@your-project-id.iam.gserviceaccount.com",
  "client_id": "...",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "..."
}
----
// NOTCONSOLE

This file must be copied into the `config` directory of the Elasticsearch installation on
every node of the cluster.
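Before distributing the file to every node, it can help to check that it contains the fields shown above. A minimal, illustrative Python sketch (the field list simply mirrors the example; the plugin performs its own validation when the repository is created):

```python
import json

# Fields present in a Google service account key file, as in the example above.
REQUIRED_FIELDS = {
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "client_id", "auth_uri", "token_uri",
}

def missing_fields(content: str) -> set:
    """Return the required fields missing from a service account JSON document."""
    data = json.loads(content)
    return REQUIRED_FIELDS - data.keys()
```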
To indicate that a repository should use a service account file:

[source,js]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "service_account": "service_account.json"
  }
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]
[[repository-gcs-bucket-permission]]
===== Set Bucket Permission

The service account used to access the bucket must have "Writer" access to the bucket:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
2. Select your project
3. Go to the https://console.cloud.google.com/storage/browser[Storage Browser]
4. Select the bucket and "Edit bucket permission"
5. The service account must be configured as a "User" with "Writer" access
[[repository-gcs-repository]]
==== Create a Repository

Once everything is installed and every node is started, you can create a new repository that
uses Google Cloud Storage to store snapshots:

[source,js]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "service_account": "service_account.json"
  }
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]
The following settings are supported:

`bucket`::

    The name of the bucket to be used for snapshots. (Mandatory)

`service_account`::

    The service account to use. It can be a relative path to a service account JSON file
    or the value `_default_`, which indicates that the built-in Compute Engine service
    account should be used.

`base_path`::

    Specifies the path within the bucket to the repository data. Defaults to
    the root of the bucket.

`chunk_size`::

    Big files can be broken down into chunks during snapshotting if needed.
    The chunk size can be specified in bytes or by using size value notation,
    i.e. `1g`, `10m`, `5k`. Defaults to `100m`.

`compress`::

    When set to `true`, metadata files are stored in compressed format. This
    setting doesn't affect index files that are already compressed by default.
    Defaults to `false`.

`application_name`::

    Name used by the plugin when it uses the Google Cloud JSON API. Setting
    a custom name can be useful to identify your cluster when request
    statistics are logged in the Google Cloud Platform. Defaults to `repository-gcs`.
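For illustration, the size value notation accepted by `chunk_size` can be sketched as a small Python parser. This is a simplified stand-in written for this example, not the plugin's own code (Elasticsearch does its own byte-size parsing), and it assumes binary units (`1k` = 1024 bytes):

```python
# Simplified parser for the size value notation used by `chunk_size`
# (illustrative only; assumes binary units, e.g. 1k = 1024 bytes).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def parse_size(value: str) -> int:
    """Convert a value such as '1g', '10m' or '5k' to a number of bytes."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # plain byte count with no unit suffix
```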