[[repository-gcs]]
=== Google Cloud Storage repository

You can use the https://cloud.google.com/storage/[Google Cloud Storage]
service as a repository for {ref}/modules-snapshots.html[Snapshot/Restore].

[[repository-gcs-usage]]
==== Getting started

This repository type uses the https://github.com/GoogleCloudPlatform/google-cloud-java/tree/master/google-cloud-clients/google-cloud-storage[Google Cloud Java Client for Storage]
to connect to the Storage service. If you are using
https://cloud.google.com/storage/[Google Cloud Storage] for the first time, you
must connect to the https://console.cloud.google.com/[Google Cloud Platform Console]
and create a new project. After your project is created, you must enable the
Cloud Storage Service for your project.

[[repository-gcs-creating-bucket]]
===== Creating a bucket

The Google Cloud Storage service uses the concept of a
https://cloud.google.com/storage/docs/key-terms[bucket] as a container for all
the data. Buckets are usually created using the
https://console.cloud.google.com/[Google Cloud Platform Console]. This
repository type does not automatically create buckets.

To create a new bucket:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console].
2. Select your project.
3. Go to the https://console.cloud.google.com/storage/browser[Storage Browser].
4. Click the *Create Bucket* button.
5. Enter the name of the new bucket.
6. Select a storage class.
7. Select a location.
8. Click the *Create* button.

For more detailed instructions, see the
https://cloud.google.com/storage/docs/quickstart-console#create_a_bucket[Google Cloud documentation].

[[repository-gcs-service-authentication]]
===== Service authentication

The repository must authenticate the requests it makes to the Google Cloud Storage
service. It is common for Google client libraries to employ a strategy named https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application[application default credentials].
However, that strategy is only **partially supported** by Elasticsearch. The
repository operates under the Elasticsearch process, which runs with the security
manager enabled. The security manager obstructs the "automatic" credential discovery
when the environment variable `GOOGLE_APPLICATION_CREDENTIALS` is used to point to a
local file on disk. The repository can, however, retrieve the service account that is
attached to the resource that is running Elasticsearch, or fall back to the default
service account that Compute Engine, Kubernetes Engine or App Engine provide.
If your environment does not support automatic credential discovery, you must
configure <<repository-gcs-using-service-account,service account>> credentials instead.

[[repository-gcs-using-service-account]]
===== Using a service account

You have to obtain and provide https://cloud.google.com/iam/docs/overview#service_account[service account credentials]
manually.

For detailed information about generating JSON service account files, see the https://cloud.google.com/storage/docs/authentication?hl=en#service_accounts[Google Cloud documentation].
Note that the PKCS12 format is not supported by this repository type.

Here is a summary of the steps:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console].
2. Select your project.
3. Select the https://console.cloud.google.com/iam-admin/serviceaccounts[Service Accounts] tab.
4. Click *Create service account*.
5. After the account is created, select it and go to *Keys*.
6. Select *Add Key* and then *Create new key*.
7. Select the *JSON* key type, because P12 is unsupported.

A JSON service account file looks like this:

[source,js]
----
{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "service-account-for-your-repository@your-project-id.iam.gserviceaccount.com",
  "client_id": "...",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-for-your-repository@your-project-id.iam.gserviceaccount.com"
}
----
// NOTCONSOLE

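Before adding a downloaded key file to the keystore, it can help to confirm
that it is a JSON service account file rather than, say, a P12 key. The
following Python sketch is illustrative only: the `check_service_account`
helper and its list of required fields are assumptions based on the sample
file, not rules enforced by Elasticsearch or Google.

[source,python]
----
import json

# Fields taken from the sample file; treating them as required is an assumption.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_service_account(text):
    """Return a list of problems found in a service account JSON document."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    return problems
----

An empty list means the basic shape looks right; it does not prove that the
key is valid or that the account has access to the bucket.
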
To provide this file to the repository, store it in the {ref}/secure-settings.html[Elasticsearch keystore].
Add a `file` setting named `gcs.client.NAME.credentials_file` using the `add-file` subcommand.
`NAME` is the name of the client configuration for the repository. The implicit client
name is `default`, but a different client name can be specified in the
repository settings with the `client` key.

NOTE: Passing the file path via the `GOOGLE_APPLICATION_CREDENTIALS` environment
variable is **not** supported.

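To make the mapping from client name to keystore setting concrete, here is a
small Python sketch. The helper names `credentials_setting` and
`add_file_command` are hypothetical, and the character check on the client
name is an assumption made for this example, not a documented restriction.

[source,python]
----
import re


def credentials_setting(client_name="default"):
    """Return the keystore setting name for a client's service account file."""
    # Assumption: allow only letters, digits, '_' and '-' to keep the example simple.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", client_name):
        raise ValueError(f"unexpected client name: {client_name!r}")
    return f"gcs.client.{client_name}.credentials_file"


def add_file_command(path, client_name="default"):
    """Return the elasticsearch-keystore invocation as an argument list."""
    return ["bin/elasticsearch-keystore", "add-file", credentials_setting(client_name), path]
----

The argument list can be passed to `subprocess.run` from the Elasticsearch
home directory, or simply printed and run by hand.
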
For example, if you added a `gcs.client.my_alternate_client.credentials_file`
setting in the keystore, you can configure a repository to use those credentials
like this:

[source,console]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "client": "my_alternate_client"
  }
}
----
// TEST[skip:we don't have gcs setup while testing this]

The `credentials_file` settings are {ref}/secure-settings.html#reloadable-secure-settings[reloadable].
After you reload the settings, the internal `gcs` clients, which are used to
transfer the snapshot contents, use the latest settings from the keystore.

NOTE: Snapshot or restore jobs that are in progress are not preempted by a *reload*
of the client's `credentials_file` settings. They complete using the client as
it was built when the operation started.

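A reload is triggered through the nodes reload secure settings API
(`POST _nodes/reload_secure_settings`). The Python sketch below only builds
the request. It assumes an unsecured cluster at `http://localhost:9200` and a
keystore without a password; both are placeholders, and the `reload_request`
helper is hypothetical.

[source,python]
----
import urllib.request


def reload_request(base_url="http://localhost:9200"):
    """Build a POST request that asks every node to reload its secure settings."""
    # Assumption: no authentication and no keystore password are required.
    url = base_url.rstrip("/") + "/_nodes/reload_secure_settings"
    return urllib.request.Request(url=url, method="POST")


# To send it against a running cluster:
# with urllib.request.urlopen(reload_request()) as response:
#     print(response.read().decode("utf-8"))
----

The keystore file must already be updated on every node before the reload is
requested, because each node reads its own local keystore.
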
[[repository-gcs-client]]
==== Client settings

The client used to connect to Google Cloud Storage has a number of settings available.
Client setting names are of the form `gcs.client.CLIENT_NAME.SETTING_NAME` and are specified
inside `elasticsearch.yml`. The default client name looked up by a `gcs` repository is
called `default`, but can be customized with the repository setting `client`.

For example:

[source,console]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_bucket",
    "client": "my_alternate_client"
  }
}
----
// TEST[skip:we don't have gcs setup while testing this]

Some settings are sensitive and must be stored in the
{ref}/secure-settings.html[Elasticsearch keystore]. This is the case for the service account file:

[source,sh]
----
bin/elasticsearch-keystore add-file gcs.client.default.credentials_file /path/service-account.json
----

The following are the available client settings. Those that must be stored in the keystore
are marked as `Secure`.

`credentials_file` ({ref}/secure-settings.html[Secure], {ref}/secure-settings.html#reloadable-secure-settings[reloadable])::

    The service account file that is used to authenticate to the Google Cloud Storage service.

`endpoint`::

    The Google Cloud Storage service endpoint to connect to. This is automatically
    determined by the Google Cloud Storage client but can be specified explicitly.

`connect_timeout`::

    The timeout to establish a connection to the Google Cloud Storage service. The value should
    specify the unit. For example, a value of `5s` specifies a 5-second timeout. The value of `-1`
    corresponds to an infinite timeout. The default value is 20 seconds.

`read_timeout`::

    The timeout to read data from an established connection. The value should
    specify the unit. For example, a value of `5s` specifies a 5-second timeout. The value of `-1`
    corresponds to an infinite timeout. The default value is 20 seconds.

`application_name`::

    Name used by the client when it uses the Google Cloud Storage service. Setting
    a custom name can be useful to identify your cluster when request
    statistics are logged in Google Cloud Platform. Defaults to `repository-gcs`.

`project_id`::

    The Google Cloud project ID. This is automatically inferred from the credentials file but
    can be specified explicitly. For example, it can be used to switch between projects when the
    same credentials are usable for both the production and the development projects.

`proxy.host`::

    Host name of the proxy through which to connect to Google Cloud Storage.

`proxy.port`::

    Port of the proxy through which to connect to Google Cloud Storage.

`proxy.type`::

    Proxy type for the client. Supported values are `direct` (no proxy),
    `http`, and `socks`. Defaults to `direct`.

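As an illustration of the `gcs.client.CLIENT_NAME.SETTING_NAME` naming scheme,
the following Python sketch renders non-secure client settings as flat
`elasticsearch.yml` lines. The `client_yaml` helper is hypothetical, and its
validation covers only the settings listed in this section.

[source,python]
----
# Settings from the list in this section; credentials_file belongs in the keystore.
SECURE_SETTINGS = {"credentials_file"}
PLAIN_SETTINGS = {
    "endpoint", "connect_timeout", "read_timeout", "application_name",
    "project_id", "proxy.host", "proxy.port", "proxy.type",
}
PROXY_TYPES = {"direct", "http", "socks"}


def client_yaml(client_name, settings):
    """Render client settings as flat `key: value` lines for elasticsearch.yml."""
    lines = []
    for key, value in settings.items():
        if key in SECURE_SETTINGS:
            raise ValueError(f"{key} must be stored in the keystore")
        if key not in PLAIN_SETTINGS:
            raise ValueError(f"unknown client setting: {key}")
        if key == "proxy.type" and value not in PROXY_TYPES:
            raise ValueError(f"unsupported proxy type: {value}")
        lines.append(f"gcs.client.{client_name}.{key}: {value}")
    return "\n".join(lines)
----

For example, `client_yaml("default", {"connect_timeout": "5s"})` yields the
single line `gcs.client.default.connect_timeout: 5s`.
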
[[repository-gcs-repository]]
==== Repository settings

The `gcs` repository type supports a number of settings to customize how data
is stored in Google Cloud Storage.

These can be specified when creating the repository. For example:

[source,console]
----
PUT _snapshot/my_gcs_repository
{
  "type": "gcs",
  "settings": {
    "bucket": "my_other_bucket",
    "base_path": "dev"
  }
}
----
// TEST[skip:we don't have gcs set up while testing this]

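When repositories are registered from a script, the same request body can be
assembled programmatically. This Python sketch builds only the JSON body. The
`gcs_repository_body` helper is hypothetical, and leaving out settings that
match their documented defaults is a design choice of this example.

[source,python]
----
def gcs_repository_body(bucket, client="default", base_path=None, compress=True):
    """Build the body of a `PUT _snapshot/<name>` request for a gcs repository."""
    if not bucket:
        raise ValueError("bucket is mandatory")
    settings = {"bucket": bucket}
    # Only include settings that differ from their documented defaults.
    if client != "default":
        settings["client"] = client
    if base_path:
        settings["base_path"] = base_path
    if not compress:
        settings["compress"] = False
    return {"type": "gcs", "settings": settings}
----

Calling `gcs_repository_body("my_other_bucket", base_path="dev")` produces a
body equivalent to the console example.
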
The following settings are supported:

`bucket`::

    The name of the bucket to be used for snapshots. (Mandatory)

`client`::

    The name of the client to use to connect to Google Cloud Storage.
    Defaults to `default`.

`base_path`::

    Specifies the path within the bucket to the repository data. Defaults to
    the root of the bucket.
+
NOTE: Don't set `base_path` when configuring a snapshot repository for {ECE}.
{ECE} automatically generates the `base_path` for each deployment so that
multiple deployments may share the same bucket.

`chunk_size`::

    During a snapshot, large files can be split into multiple smaller blobs in the blob store.
    Changing this value from its default is not recommended unless you have an explicit reason to limit the
    size of blobs in the repository. A value lower than the default can increase the number of API
    calls to the Google Cloud Storage service during both snapshot creation and restore,
    making both operations slower and more costly.
    Specify the chunk size as a value and unit, for example:
    `10MB`, `5KB`, `500B`. Defaults to `5TB`, the maximum size of a blob in the Google Cloud Storage service.

`compress`::

    When set to `true`, metadata files are stored in compressed format. This
    setting doesn't affect index files that are already compressed by default.
    Defaults to `true`.

include::repository-shared-settings.asciidoc[]

`application_name`::

    deprecated:[6.3.0, "This setting is now defined in the <<repository-gcs-client, client settings>>."]
    Name used by the client when it uses the Google Cloud Storage service.

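To see why a small `chunk_size` increases the number of blobs, and therefore
the number of API calls, consider the following Python sketch. It assumes
that size units are binary (`1KB` = 1024 bytes) and that a file is split into
the smallest number of chunks that can hold it. The helpers `parse_size` and
`blob_count` are hypothetical and simplify what the repository actually does.

[source,python]
----
import re

# Assumption: binary units, so 1KB = 1024 bytes.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a value such as '10MB' into a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*(B|KB|MB|GB|TB)", text.strip().upper())
    if not match:
        raise ValueError(f"cannot parse size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def blob_count(file_size, chunk_size="5TB"):
    """Estimate how many blobs one file of `file_size` is split into."""
    size, chunk = parse_size(file_size), parse_size(chunk_size)
    if chunk == 0:
        raise ValueError("chunk size must be greater than zero")
    # Integer ceiling division; an empty file still occupies one blob here.
    return max(1, (size + chunk - 1) // chunk)
----

Under these assumptions, a 1GB file stays a single blob with the default
`5TB` chunk size, while `blob_count("1GB", "10MB")` gives 103 blobs.
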
[[repository-gcs-bucket-permission]]
===== Recommended bucket permission

The service account used to access the bucket must have "Writer" access to the bucket:

1. Connect to the https://console.cloud.google.com/[Google Cloud Platform Console].
2. Select your project.
3. Go to the https://console.cloud.google.com/storage/browser[Storage Browser].
4. Select the bucket and choose "Edit bucket permission".
5. Configure the service account as a "User" with "Writer" access.