[[discovery-ec2]]
=== EC2 Discovery Plugin

The EC2 discovery plugin uses the https://github.com/aws/aws-sdk-java[AWS API]
to identify the addresses of seed hosts.

*If you are looking for a hosted solution of Elasticsearch on AWS, please visit http://www.elastic.co/cloud.*

:plugin_name: discovery-ec2
include::install_remove.asciidoc[]

[[discovery-ec2-usage]]
==== Getting started with AWS

The plugin adds a seed hosts provider named `ec2`. This seed hosts provider
finds other Elasticsearch instances in EC2 by querying the AWS API.
Authentication is done using
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html[IAM
Role] credentials by default. To enable the plugin, configure {es} to use the
`ec2` seed hosts provider:

[source,yaml]
----
discovery.seed_providers: ec2
----

==== Settings

EC2 discovery supports a number of settings. Some settings are sensitive and
must be stored in the {ref}/secure-settings.html[elasticsearch keystore]. For
example, to use explicit AWS access keys:

[source,sh]
----
bin/elasticsearch-keystore add discovery.ec2.access_key
bin/elasticsearch-keystore add discovery.ec2.secret_key
----

The following are the available discovery settings. All should be prefixed with `discovery.ec2.`.
Those that must be stored in the keystore are marked as `Secure`.

`access_key`::

    An EC2 access key. The `secret_key` setting must also be specified. (Secure)

`secret_key`::

    An EC2 secret key. The `access_key` setting must also be specified. (Secure)

`session_token`::

    An EC2 session token. The `access_key` and `secret_key` settings must also
    be specified. (Secure)

`endpoint`::

    The EC2 service endpoint to connect to. See
    http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region. This
    defaults to `ec2.us-east-1.amazonaws.com`.

`protocol`::

    The protocol to use to connect to EC2. Valid values are either `http`
    or `https`. Defaults to `https`.

`proxy.host`::

    The host name of a proxy to connect to EC2 through.

`proxy.port`::

    The port of a proxy to connect to EC2 through.

`proxy.username`::

    The username to connect to the `proxy.host` with. (Secure)

`proxy.password`::

    The password to connect to the `proxy.host` with. (Secure)

`read_timeout`::

    The socket timeout for connecting to EC2. The value should specify the unit. For example,
    a value of `5s` specifies a 5 second timeout. The default value is 50 seconds.

`groups`::

    Either a comma-separated list or an array of security groups.
    Only instances with the provided security groups are used in
    cluster discovery. (NOTE: You can provide either the group NAME or the group
    ID.)

`host_type`::
+
--
The type of host address to use to communicate with other instances. Can be
one of `private_ip`, `public_ip`, `private_dns`, `public_dns` or `tag:TAGNAME`, where
`TAGNAME` refers to the name of a tag configured for all EC2 instances. Instances that don't
have this tag set are ignored by the discovery process.

For example, if you defined a tag `my-elasticsearch-host` in EC2 and set it to `myhostname1.mydomain.com`, then
setting `host_type: tag:my-elasticsearch-host` tells the EC2 discovery plugin to read the host name from the
`my-elasticsearch-host` tag. In this case, it resolves to `myhostname1.mydomain.com`.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html[Read more about EC2 Tags].

Defaults to `private_ip`.
--

`availability_zones`::

    Either a comma-separated list or an array of availability zones.
    Only instances within the provided availability zones are used in
    cluster discovery.

`any_group`::

    If set to `false`, an instance must belong to all of the listed security
    groups to be used for discovery. Defaults to `true`.

`node_cache_time`::

    How long the list of hosts is cached to prevent further requests to the AWS API.
    Defaults to `10s`.

*All* secure settings of this plugin are {ref}/secure-settings.html#reloadable-secure-settings[reloadable].
After you reload the settings, an AWS SDK client with the latest settings
from the keystore will be used.

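
As an illustration, the non-secure settings above might be combined in
`elasticsearch.yml` as follows. This is only a sketch: the endpoint region,
security group name, and availability zones are hypothetical placeholders, not
recommended values.

[source,yaml]
----
discovery.seed_providers: ec2
# Hypothetical regional endpoint; the default is ec2.us-east-1.amazonaws.com
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
# Hypothetical security group name; a group ID can be used instead
discovery.ec2.groups: my-es-security-group
discovery.ec2.any_group: true
# Hypothetical availability zones
discovery.ec2.availability_zones: eu-west-1a,eu-west-1b
discovery.ec2.host_type: private_ip
discovery.ec2.node_cache_time: 10s
----

Settings marked as `Secure`, such as `access_key`, do not belong in this file
and must be added to the keystore instead.
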
[IMPORTANT]
.Binding the network host
==============================================

It's important to define `network.host`, because by default it is bound to `localhost`.

You can use {ref}/modules-network.html[core network host settings] or
<<discovery-ec2-network-host,ec2 specific host settings>>.

==============================================

[[discovery-ec2-network-host]]
===== EC2 Network Host

When the `discovery-ec2` plugin is installed, the following are also allowed
as valid network host settings:

[cols="<,<",options="header",]
|==================================================================
|EC2 Host Value |Description
|`_ec2:privateIpv4_` |The private IP address (ipv4) of the machine.
|`_ec2:privateDns_` |The private host of the machine.
|`_ec2:publicIpv4_` |The public IP address (ipv4) of the machine.
|`_ec2:publicDns_` |The public host of the machine.
|`_ec2:privateIp_` |equivalent to `_ec2:privateIpv4_`.
|`_ec2:publicIp_` |equivalent to `_ec2:publicIpv4_`.
|`_ec2_` |equivalent to `_ec2:privateIpv4_`.
|==================================================================

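
For instance, a node could bind to its private IPv4 address with the following
line in `elasticsearch.yml`. Which of the values above is appropriate depends on
your network layout, so treat this choice as an example only.

[source,yaml]
----
# Bind to the instance's private IPv4 address
network.host: _ec2:privateIpv4_
----
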
[[discovery-ec2-permissions]]
===== Recommended EC2 Permissions

EC2 discovery requires making a call to the EC2 service. You'll want to set up
an IAM policy to allow this. You can create a custom policy via the IAM
Management Console. It should look similar to this:

[source,js]
----
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
----
// NOTCONSOLE

[[discovery-ec2-filtering]]
===== Filtering by Tags

The EC2 discovery plugin can also filter the machines to include in the cluster
based on tags (and not just groups). The settings to use include the
`discovery.ec2.tag.` prefix. For example, if you defined a tag `stage` in EC2
and set it to `dev`, setting `discovery.ec2.tag.stage` to `dev` restricts
discovery to instances that have a `stage` tag with a value of `dev`. Adding
multiple `discovery.ec2.tag` settings requires all of those tags to be set
for the instance to be included.

One practical use for tag filtering is when a cluster running on EC2 contains
many nodes that are not master-eligible {es} nodes. In this case, tagging the EC2
instances that _are_ running the master-eligible {es} nodes, and then filtering
by that tag, helps discovery run more efficiently.

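
As a sketch, the following `elasticsearch.yml` fragment restricts discovery to
instances that carry both a `stage` tag set to `dev` and a second tag. The
`role` tag and its `master` value are hypothetical, chosen only to illustrate
how several tag settings combine.

[source,yaml]
----
discovery.seed_providers: ec2
discovery.ec2.tag.stage: dev
# Hypothetical second tag; an instance must match every tag listed
discovery.ec2.tag.role: master
----
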
[[discovery-ec2-attributes]]
===== Automatic Node Attributes

The `discovery-ec2` plugin can automatically add node attributes relating to EC2. This works even if you do not use
`ec2` as the seed hosts provider, but the plugin must still be installed. Currently the plugin adds only an
`aws_availability_zone` node attribute, which is the availability zone of the current node; other attributes may be
supported in the future. Attributes can be used to isolate primary and replica shards across availability zones by using the
{ref}/allocation-awareness.html[Allocation Awareness] feature.

To enable it, set `cloud.node.auto_attributes` to `true` in the settings. For example:

[source,yaml]
----
cloud.node.auto_attributes: true

cluster.routing.allocation.awareness.attributes: aws_availability_zone
----

[[cloud-aws-best-practices]]
==== Best Practices in AWS

A collection of best practices and other information about running Elasticsearch on AWS.

===== Instance/Disk

When selecting storage, consider the following options, listed from least to most preferred:

* https://aws.amazon.com/efs/[EFS] - Avoid. The sacrifices made to offer durability, shared storage, and the ability to grow and shrink come at a performance cost, and such file systems have been known to cause corruption of indices. Because Elasticsearch is distributed and has built-in replication, the benefits that EFS offers are not needed.
* https://aws.amazon.com/ebs/[EBS] - Works well if you are running a small cluster (1-2 nodes) and cannot easily tolerate the loss of all storage backing a node, or if you are running indices with no replicas. If you use EBS, use provisioned IOPS to ensure performance.
* http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html[Instance Store] - When running larger clusters with replicas, the ephemeral nature of Instance Store is ideal, since Elasticsearch can tolerate the loss of shards. With Instance Store you get the performance benefit of having disk physically attached to the host running the instance, and also the cost benefit of not paying extra for EBS.

Prefer https://aws.amazon.com/amazon-linux-ami/[Amazon Linux AMIs]. Because Elasticsearch runs on the JVM, OS dependencies are minimal, and you can benefit from the lightweight nature, support, and EC2-specific performance tweaks that the Amazon Linux AMIs offer.

===== Networking

* Network throttling takes place on smaller instance types, limiting both https://lab.getbase.com/how-we-discovered-limitations-on-the-aws-tcp-stack/[bandwidth and number of connections]. Therefore, if a large number of connections is needed and networking is becoming a bottleneck, avoid https://aws.amazon.com/ec2/instance-types/[instance types] with networking labeled as `Moderate` or `Low`.
* When running in multiple http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html[availability zones], be sure to use {ref}/allocation-awareness.html[shard allocation awareness] so that not all copies of shard data reside in the same availability zone.
* Do not span a cluster across regions. If necessary, use cross-cluster search.

===== Misc

* If you have split your nodes into roles, consider https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html[tagging the EC2 instances] by role to make it easier to filter and view your EC2 instances in the AWS console.
* Consider https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/terminating-instances.html#Using_ChangingDisableAPITermination[enabling termination protection] for all of your instances to avoid accidentally terminating a node in the cluster and causing a potentially disruptive reallocation.