[[repository-gcs]]
=== Google Cloud Storage Repository Plugin

The GCS repository plugin adds support for using the https://cloud.google.com/storage/[Google Cloud Storage]
service as a repository for {ref}/modules-snapshots.html[Snapshot/Restore].

:plugin_name: repository-gcs
include::install_remove.asciidoc[]

[[repository-gcs-usage]]
==== Getting started

The plugin uses the https://cloud.google.com/storage/docs/json_api/[Google Cloud Storage JSON API] (v1)
to connect to the Storage service. If this is your first time using Google Cloud Storage, you must
first connect to the https://console.cloud.google.com/[Google Cloud Platform Console] and create a new
project. Once your project is created, you must enable the Cloud Storage Service for your project.

[[repository-gcs-creating-bucket]]
===== Creating a Bucket

Google Cloud Storage uses https://cloud.google.com/storage/docs/key-terms[buckets]
as containers for all data. Buckets are usually created using the
https://console.cloud.google.com/[Google Cloud Platform Console]. The plugin does not
create buckets automatically.

To create a new bucket:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
2. Select your project
3. Go to the https://console.cloud.google.com/storage/browser[Storage Browser]
4. Click the "Create Bucket" button
5. Enter the name of the new bucket
6. Select a storage class
7. Select a location
8. Click the "Create" button

The bucket should now be created.
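
The same steps can also be carried out from the command line. The following is a minimal sketch,
assuming the `gsutil` tool from the Google Cloud SDK is installed and authenticated. The
`create_snapshot_bucket` function name is hypothetical and is not provided by the plugin.

[source,sh]
----
# Hypothetical helper: wraps `gsutil mb` (make bucket).
# Arguments: project id, storage class, location, bucket name.
create_snapshot_bucket() {
  gsutil mb -p "$1" -c "$2" -l "$3" "gs://$4"
}
----

For example, `create_snapshot_bucket your-project-id standard us-central1 my_bucket` would
correspond to selecting a project, storage class and location and then naming the bucket in the
console. The project, class and location values here are placeholders.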

[[repository-gcs-service-authentication]]
===== Service Authentication

The plugin supports two authentication modes:

* The built-in <<repository-gcs-using-compute-engine, Compute Engine authentication>>. This mode is
recommended if your Elasticsearch node is running on a Compute Engine virtual machine.

* Specifying <<repository-gcs-using-service-account, Service Account>> credentials.

[[repository-gcs-using-compute-engine]]
===== Using Compute Engine

When running on Compute Engine, the plugin uses Google's built-in authentication mechanism to
authenticate with the Storage service. Compute Engine virtual machines are usually associated with a
default service account. This service account can be found in the VM instance details in the
https://console.cloud.google.com/compute/[Compute Engine console].

This is the default authentication mode and requires no configuration.

NOTE: The Compute Engine VM must be allowed to use the Storage service. This can be done only at VM
creation time, when "Storage" access can be configured to "Read/Write" permission. Check your
instance details at the section "Cloud API access scopes".
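
If you create the VM from the command line instead of the console, the access scope can be set at
the same time. This is a sketch only, assuming the `gcloud` tool is installed and authenticated.
The `create_es_vm` function name is hypothetical, and `storage-rw` is the `gcloud` scope alias
for read/write access to Cloud Storage.

[source,sh]
----
# Hypothetical helper: creates a VM whose default service account may
# read from and write to Cloud Storage.
# Arguments: instance name, zone.
create_es_vm() {
  gcloud compute instances create "$1" --zone "$2" --scopes=storage-rw
}
----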

[[repository-gcs-using-service-account]]
===== Using a Service Account

If your Elasticsearch node is not running on Compute Engine, or if you don't want to use Google's
built-in authentication mechanism, you can authenticate with the Storage service using a
https://cloud.google.com/iam/docs/overview#service_account[Service Account] file.

To create a service account file:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
2. Select your project
3. Go to the https://console.cloud.google.com/permissions[Permission] tab
4. Select the https://console.cloud.google.com/permissions/serviceaccounts[Service Accounts] tab
5. Click on "Create service account"
6. Once created, select the new service account and download a JSON key file

A service account file looks like this:

[source,js]
----
{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "service-account-for-your-repository@your-project-id.iam.gserviceaccount.com",
  "client_id": "..."
}
----
// NOTCONSOLE
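
Before using a downloaded key file, it can be worth checking that it really is a service account
file. The helper below is a rough sketch with a hypothetical name. It only looks for two of the
fields shown above and does not verify the key itself.

[source,sh]
----
# Hypothetical helper: succeeds (exit status 0) when the file contains
# "type": "service_account" and a "client_email" field.
looks_like_service_account_file() {
  grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1" &&
    grep -q '"client_email"' "$1"
}
----

For example, `looks_like_service_account_file /path/to/key.json || echo "unexpected file"`, where
the path is a placeholder.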

This file must be stored in the {ref}/secure-settings.html[elasticsearch keystore], under a setting name
of the form `gcs.client.NAME.credentials_file`, where `NAME` is the name of the client configuration.
The default client name is `default`, but a different client name can be specified in repository
settings using `client`.

For example, if specifying the credentials file in the keystore under
`gcs.client.my_alternate_client.credentials_file`, you can configure a repository to use these
credentials like this:

[source,js]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "client": "my_alternate_client"
  }
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]
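
The keystore entry that such a repository relies on could be created along these lines. This is a
sketch: the `add_gcs_credentials` function name is hypothetical, and it assumes `ES_HOME` points
at the Elasticsearch installation directory.

[source,sh]
----
# Hypothetical helper: stores a service account file in the keystore
# under gcs.client.NAME.credentials_file.
# Arguments: client name, path to the service account JSON file.
add_gcs_credentials() {
  "${ES_HOME:-.}/bin/elasticsearch-keystore" add-file \
    "gcs.client.$1.credentials_file" "$2"
}
----

For example, `add_gcs_credentials my_alternate_client /path/to/key.json`, where the path is a
placeholder.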

The `credentials_file` settings are {ref}/secure-settings.html#reloadable-secure-settings[reloadable].
After you reload the settings, the internal `gcs` clients, which transfer the
snapshot contents, use the latest settings from the keystore.

NOTE: In-progress snapshot/restore jobs are not preempted by a *reload*
of the client's `credentials_file` settings. They complete using the client
as it was built when the operation started.
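
A reload can be triggered through the nodes reload secure settings API. The sketch below uses
`curl` and assumes a cluster reachable without authentication at `ES_URL` (defaulting to
`http://localhost:9200`). The `reload_gcs_credentials` function name is hypothetical.

[source,sh]
----
# Hypothetical helper: asks all nodes to re-read their keystores.
# Assumes an unsecured cluster at ES_URL (default http://localhost:9200).
reload_gcs_credentials() {
  curl -s -X POST "${ES_URL:-http://localhost:9200}/_nodes/reload_secure_settings"
}
----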

[[repository-gcs-client]]
==== Client Settings

The client used to connect to Google Cloud Storage has a number of settings available.
Client setting names are of the form `gcs.client.CLIENT_NAME.SETTING_NAME` and are specified
in `elasticsearch.yml`. The default client name looked up by a `gcs` repository is
`default`, but it can be customized with the repository setting `client`.

For example:

[source,js]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "client": "my_alternate_client"
  }
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]

Some settings are sensitive and must be stored in the
{ref}/secure-settings.html[elasticsearch keystore]. This is the case for the service account file:

[source,sh]
----
bin/elasticsearch-keystore add-file gcs.client.default.credentials_file
----

The following are the available client settings. Those that must be stored in the keystore
are marked as `Secure`.

`credentials_file`::

    The service account file that is used to authenticate to the Google Cloud Storage service. (Secure)

`endpoint`::

    The Google Cloud Storage service endpoint to connect to. This will be automatically
    determined by the Google Cloud Storage client but can be specified explicitly.

`connect_timeout`::

    The timeout to establish a connection to the Google Cloud Storage service. The value should
    specify the unit. For example, a value of `5s` specifies a 5 second timeout. The value of `-1`
    corresponds to an infinite timeout. The default value is 20 seconds.

`read_timeout`::

    The timeout to read data from an established connection. The value should
    specify the unit. For example, a value of `5s` specifies a 5 second timeout. The value of `-1`
    corresponds to an infinite timeout. The default value is 20 seconds.

`application_name`::

    Name used by the client when it uses the Google Cloud Storage service. Setting
    a custom name can be useful to identify your cluster when request
    statistics are logged in the Google Cloud Platform. Defaults to `repository-gcs`.

`project_id`::

    The Google Cloud project id. This will be automatically inferred from the credentials file but
    can be specified explicitly. For example, it can be used to switch between projects when the
    same credentials are usable for both the production and the development projects.
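
As an illustration of the `gcs.client.CLIENT_NAME.SETTING_NAME` form, the sketch below appends a
few non-secure client settings to a configuration file. The `append_gcs_client_settings` function
name, the timeout values and the application name are all hypothetical examples.

[source,sh]
----
# Hypothetical helper: appends example client settings to the given
# elasticsearch.yml. The values are illustrative only.
# Argument: path to elasticsearch.yml.
append_gcs_client_settings() {
  cat >> "$1" <<'EOF'
gcs.client.default.connect_timeout: 30s
gcs.client.default.read_timeout: 60s
gcs.client.my_alternate_client.application_name: my-cluster-snapshots
EOF
}
----

Here the timeouts apply to the `default` client, while the application name applies only to
repositories that set `client` to `my_alternate_client`.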

[[repository-gcs-repository]]
==== Repository Settings

The `gcs` repository type supports a number of settings to customize how data
is stored in Google Cloud Storage.

These can be specified when creating the repository. For example:

[source,js]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_other_bucket",
    "base_path": "dev"
  }
}
----
// CONSOLE
// TEST[skip:we don't have gcs set up while testing this]

The following settings are supported:

`bucket`::

    The name of the bucket to be used for snapshots. (Mandatory)

`client`::

    The name of the client to use to connect to Google Cloud Storage.
    Defaults to `default`.

`base_path`::

    Specifies the path within the bucket to the repository data. Defaults to
    the root of the bucket.

`chunk_size`::

    Big files can be broken down into chunks during snapshotting if needed.
    The chunk size can be specified in bytes or by using size value notation,
    e.g. `1g`, `10m`, `5k`. Defaults to `100m`.

`compress`::

    When set to `true`, metadata files are stored in compressed format. This
    setting doesn't affect index files that are already compressed by default.
    Defaults to `false`.

`application_name`::

    deprecated[7.0.0, This setting is now defined in the <<repository-gcs-client, client settings>>]
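
The optional settings can be combined in a single request. The sketch below sends such a request
with `curl`, assuming a cluster reachable without authentication at `ES_URL` (defaulting to
`http://localhost:9200`). The `register_gcs_repository` function name is hypothetical, and the
`50m` chunk size and `compress` value are arbitrary examples.

[source,sh]
----
# Hypothetical helper: registers a gcs repository with explicit
# base_path, chunk_size and compress settings.
# Assumes an unsecured cluster at ES_URL (default http://localhost:9200).
# Arguments: repository name, bucket name.
register_gcs_repository() {
  curl -s -X PUT "${ES_URL:-http://localhost:9200}/_snapshot/$1" \
    -H 'Content-Type: application/json' \
    -d "{\"type\":\"gcs\",\"settings\":{\"bucket\":\"$2\",\"base_path\":\"dev\",\"chunk_size\":\"50m\",\"compress\":true}}"
}
----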

[[repository-gcs-bucket-permission]]
===== Recommended Bucket Permission

The service account used to access the bucket must have "Writer" access to the bucket:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
2. Select your project
3. Go to the https://console.cloud.google.com/storage/browser[Storage Browser]
4. Select the bucket and "Edit bucket permission"
5. The service account must be configured as a "User" with "Writer" access
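
The same permission can be granted from the command line. This is a sketch only, assuming
`gsutil` is installed and authenticated and that the bucket uses ACL-based (fine-grained) access
control. The `grant_bucket_writer` function name is hypothetical.

[source,sh]
----
# Hypothetical helper: grants WRITE permission (the "Writer" access
# described above) on a bucket to a service account.
# Arguments: service account email, bucket name.
grant_bucket_writer() {
  gsutil acl ch -u "$1:W" "gs://$2"
}
----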