[role="xpack"]
[[security-troubleshooting]]
== {security} Troubleshooting
++++
<titleabbrev>{security}</titleabbrev>
++++

Use the information in this section to troubleshoot common problems and find
answers for frequently asked questions.

* <<security-trb-settings>>
* <<security-trb-roles>>
* <<security-trb-extraargs>>
* <<trouble-shoot-active-directory>>
* <<trb-security-maccurl>>
* <<trb-security-sslhandshake>>
* <<trb-security-ssl>>
* <<trb-security-kerberos>>
* <<trb-security-internalserver>>
* <<trb-security-setup>>

To get help, see <<help>>.

[[security-trb-settings]]
=== Some settings are not returned via the nodes settings API

*Symptoms:*

* When you use the {ref}/cluster-nodes-info.html[nodes info API] to retrieve
settings for a node, some information is missing.

*Resolution:*

This is intentional. Some settings are considered highly sensitive: all `ssl`
settings and the LDAP `bind_dn` and `bind_password` settings. For this reason,
these settings are filtered and are not exposed via the nodes info API REST
endpoint. You can also define additional sensitive settings that should be
hidden by using the `xpack.security.hide_settings` setting. For example, this
snippet hides the `url` settings of the `ldap1` realm and all settings of the
`ad1` realm.

[source, yaml]
------------------------------------------
xpack.security.hide_settings: xpack.security.authc.realms.ldap1.url,xpack.security.authc.realms.ad1.*
------------------------------------------
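
To preview which setting names a filter list would cover, you can mimic the
wildcard matching outside of {es}. The following Python sketch is only an
approximation: it assumes plain glob-style matching through `fnmatchcase`, which
is not the {es} implementation, and the `is_hidden` helper is a hypothetical
name used for illustration.

```python
from fnmatch import fnmatchcase

# Patterns taken from the hide_settings example above.
HIDE_PATTERNS = [
    "xpack.security.authc.realms.ldap1.url",
    "xpack.security.authc.realms.ad1.*",
]


def is_hidden(setting, patterns=HIDE_PATTERNS):
    """Return True if the setting name matches any of the hide patterns."""
    return any(fnmatchcase(setting, pattern) for pattern in patterns)


print(is_hidden("xpack.security.authc.realms.ad1.domain_name"))  # True
print(is_hidden("xpack.security.authc.realms.ldap1.order"))      # False
```

Only the `url` setting of `ldap1` is covered by the first pattern, so other
`ldap1` settings remain visible.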

[[security-trb-roles]]
=== Authorization exceptions

*Symptoms:*

* I configured the appropriate roles and the users, but I still get an
authorization exception.
* I can authenticate to LDAP, but I still get an authorization exception.

*Resolution:*

. Verify that the role names associated with the users match the roles defined
in the `roles.yml` file. You can use the `elasticsearch-users` tool to list all
the users. Any unknown roles are marked with `*`.
+
--
[source, shell]
------------------------------------------
bin/elasticsearch-users list
rdeniro : admin
alpacino : power_user
jacknich : monitoring,unknown_role* <1>
------------------------------------------
<1> `unknown_role` was not found in `roles.yml`

For more information about this command, see the
{ref}/users-command.html[`elasticsearch-users` command].
--

. If you are authenticating to LDAP, a number of configuration options can cause
this error.
+
--
|======================
|_group identification_ |

Groups are located either by an LDAP search or by the "memberOf" attribute on
the user. Also, if subtree search is turned off, the search goes only one
level deep. See the <<ldap-settings, LDAP Settings>> for all the options.
There are many options here, and the defaults do not work for all scenarios.

| _group to role mapping_|

Either the `role_mapping.yml` file or the location for this file could be
misconfigured. See <<security-files, Security Files>> for more information.

|_role definition_|

The role definition might be missing or invalid.

|======================

To help track down these possibilities, add the following lines to the end of
the `log4j2.properties` configuration file in the `ES_PATH_CONF` directory:

[source,properties]
----------------
logger.authc.name = org.elasticsearch.xpack.security.authc
logger.authc.level = DEBUG
----------------

A successful authentication should produce debug statements that list groups and
role mappings.
--
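
If the user list is long, the first check above can be scripted. The following
Python sketch scans the output of `bin/elasticsearch-users list` for roles
marked with `*`. It assumes the `user : role1,role2*` layout shown in the
example; the `unknown_roles` function and the sample text are illustrative only.

```python
def unknown_roles(listing):
    """Map each user to the roles that the listing marks with a trailing '*'."""
    result = {}
    for line in listing.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not "user : roles"
        user, _, roles = line.partition(":")
        flagged = [
            role.strip().rstrip("*")
            for role in roles.split(",")
            if role.strip().endswith("*")
        ]
        if flagged:
            result[user.strip()] = flagged
    return result


# Sample text copied from the listing above (without the callout marker).
sample = """\
rdeniro : admin
alpacino : power_user
jacknich : monitoring,unknown_role*
"""
print(unknown_roles(sample))  # {'jacknich': ['unknown_role']}
```

Any role that the sketch reports must either be added to `roles.yml` or removed
from the user.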

[[security-trb-extraargs]]
=== Users command fails due to extra arguments

*Symptoms:*

* The `elasticsearch-users` command fails with the following message:
`ERROR: extra arguments [...] were provided`.

*Resolution:*

This error occurs when the `elasticsearch-users` tool is parsing the input and
finds unexpected arguments. This can happen when there are special characters
used in some of the arguments. For example, on Windows systems the `,` character
is considered a parameter separator; in other words `-r role1,role2` is
translated to `-r role1 role2` and the `elasticsearch-users` tool only
recognizes `role1` as an expected parameter. The solution here is to quote the
parameter: `-r "role1,role2"`.

For more information about this command, see
{ref}/users-command.html[`elasticsearch-users` command].

[[trouble-shoot-active-directory]]
=== Users are frequently locked out of Active Directory

*Symptoms:*

* Certain users are being frequently locked out of Active Directory.

*Resolution:*

Check your realm configuration; realms are checked serially, one after another.
If your Active Directory realm is being checked before other realms and there
are usernames that appear in both Active Directory and another realm, a valid
login for one realm might be causing failed login attempts in another realm.

For example, if `UserA` exists in both Active Directory and a file realm, and
the Active Directory realm is checked first and the `file` realm is checked
second, an attempt to authenticate as `UserA` in the `file` realm would first
attempt to authenticate against Active Directory and fail, before successfully
authenticating against the `file` realm. Because authentication is verified on
each request, the Active Directory realm would be checked, and would fail, on
each request for `UserA` in the `file` realm. In this case, while the
authentication request completed successfully, the account on Active Directory
would have received several failed login attempts, and that account might become
temporarily locked out. Plan the order of your realms accordingly.

Also note that it is not typically necessary to define multiple Active Directory
realms to handle domain controller failures. When using Microsoft DNS, the DNS
entry for the domain should always point to an available domain controller.
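
The effect of realm order can be illustrated with a small simulation. This
Python sketch is not how {es} implements its realm chain; the realm names, the
passwords, and the `authenticate` helper are hypothetical and only model the
behavior described above: realms are tried in order, and every rejected attempt
counts against the account in that realm.

```python
from collections import Counter


def authenticate(username, password, realms, failures):
    """Try each realm in order and count rejected attempts per realm."""
    for name, users in realms:
        if users.get(username) == password:
            return name  # the first realm that accepts the credentials wins
        if username in users:
            failures[name] += 1  # a wrong password counts against this account
    return None


# Hypothetical data: UserA exists in both realms with different passwords.
ad_first = [
    ("active_directory", {"UserA": "ad-secret"}),
    ("file", {"UserA": "file-secret"}),
]
file_first = list(reversed(ad_first))

ad_first_failures = Counter()
file_first_failures = Counter()
for _ in range(5):  # five requests that use the file realm password
    authenticate("UserA", "file-secret", ad_first, ad_first_failures)
    authenticate("UserA", "file-secret", file_first, file_first_failures)

print(ad_first_failures["active_directory"])    # 5
print(file_first_failures["active_directory"])  # 0
```

With the Active Directory realm first, each successful request still records a
failed attempt against the directory account. With the `file` realm first, no
failed attempts reach Active Directory.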

[[trb-security-maccurl]]
=== Certificate verification fails for curl on Mac

*Symptoms:*

* `curl` on the Mac returns a certificate verification error even when the
`--cacert` option is used.

*Resolution:*

Apple's integration of `curl` with their keychain technology disables the
`--cacert` option.
See http://curl.haxx.se/mail/archive-2013-10/0036.html for more information.

You can use another tool, such as `wget`, to test certificates. Alternatively,
you can add the certificate for the signing certificate authority to the macOS
system keychain, using a procedure similar to the one detailed at the
http://support.apple.com/kb/PH14003[Apple knowledge base]. Be sure to add the
signing CA's certificate and not the server's certificate.

[[trb-security-sslhandshake]]
=== SSLHandshakeException causes connections to fail

*Symptoms:*

* An `SSLHandshakeException` causes a connection to a node to fail and indicates
that there is a configuration issue. Some of the common exceptions are shown
below with tips on how to resolve these issues.

*Resolution:*

`java.security.cert.CertificateException: No name matching node01.example.com found`::
+
--
Indicates that a client connection was made to `node01.example.com` but the
certificate returned did not contain the name `node01.example.com`. In most
cases, the issue can be resolved by ensuring the name is specified during
certificate creation. For more information, see <<ssl-tls>>. Another scenario is
when the environment does not use DNS names in certificates at all. In this
scenario, all settings in `elasticsearch.yml` should use only IP addresses,
including the `network.publish_host` setting.
--

`java.security.cert.CertificateException: No subject alternative names present`::
+
--
Indicates that a client connection was made to an IP address but the returned
certificate did not contain any `SubjectAlternativeName` entries. IP addresses
are only used for hostname verification if they are specified as a
`SubjectAlternativeName` during certificate creation. If the intent was to use
IP addresses for hostname verification, then the certificate must be
regenerated with the appropriate IP address. See <<ssl-tls>>.
--

`javax.net.ssl.SSLHandshakeException: null cert chain` and `javax.net.ssl.SSLException: Received fatal alert: bad_certificate`::
+
--
The `SSLHandshakeException` indicates that the client returned a self-signed
certificate that is not trusted because it cannot be found in the `truststore`
or `keystore`. The `SSLException` is seen on the client side of the connection.
--

`sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target` and `javax.net.ssl.SSLException: Received fatal alert: certificate_unknown`::
+
--
The `SunCertPathBuilderException` indicates that a certificate was returned
during the handshake that is not trusted. This message is seen on the client
side of the connection. The `SSLException` is seen on the server side of the
connection. The CA certificate that signed the returned certificate was not
found in the `keystore` or `truststore` and must be added in order to trust this
certificate.
--
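
The first two exceptions both come down to whether the connection target
appears among the certificate's `SubjectAlternativeName` entries. The following
Python sketch models that comparison in a simplified form. It is an assumption
for illustration, not the JVM's verification logic: it ignores common name
fallback and other edge cases, and the `matches_san` helper and the sample
values are hypothetical.

```python
import ipaddress


def matches_san(target, dns_names=(), ip_addresses=()):
    """Simplified check: is the connection target covered by the SAN entries?"""
    try:
        ip = ipaddress.ip_address(target)
    except ValueError:
        ip = None
    if ip is not None:
        # IP targets are compared only against IP address SAN entries.
        return any(ip == ipaddress.ip_address(entry) for entry in ip_addresses)
    host = target.lower().rstrip(".")
    for name in dns_names:
        name = name.lower().rstrip(".")
        if name == host:
            return True
        # A leading "*." wildcard covers exactly one left-most label.
        if name.startswith("*.") and "." in host and host.split(".", 1)[1] == name[2:]:
            return True
    return False


print(matches_san("node01.example.com", dns_names=["node01.example.com"]))  # True
print(matches_san("10.0.0.5", dns_names=["node01.example.com"]))            # False
print(matches_san("10.0.0.5", ip_addresses=["10.0.0.5"]))                   # True
```

The second call shows why connecting by IP address fails when the certificate
lists only DNS names.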

[[trb-security-ssl]]
=== Common SSL/TLS exceptions

*Symptoms:*

* You might see some exceptions related to SSL/TLS in your logs. Some of the
common exceptions are shown below with tips on how to resolve these issues.

*Resolution:*

`WARN: received plaintext http traffic on a https channel, closing connection`::
+
--
Indicates that there was an incoming plaintext HTTP request. This typically
occurs when an external application attempts to make an unencrypted call to the
REST interface. Please ensure that all applications are using `https` when
calling the REST interface with SSL enabled.
--

`org.elasticsearch.common.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record:`::
+
--
Indicates that there was incoming plaintext traffic on an SSL connection. This
typically occurs when a node is not configured to use encrypted communication
and tries to connect to nodes that are using encrypted communication. Please
verify that all nodes are using the same setting for
`xpack.security.transport.ssl.enabled`.

For more information about this setting, see
{ref}/security-settings.html[Security Settings in {es}].
--

`java.io.StreamCorruptedException: invalid internal transport message format, got`::
+
--
Indicates that data was received on the transport interface in an unknown
format. This can happen when a node with encrypted communication enabled
connects to a node that has encrypted communication disabled. Please verify that
all nodes are using the same setting for `xpack.security.transport.ssl.enabled`.

For more information about this setting, see
{ref}/security-settings.html[Security Settings in {es}].
--

`java.lang.IllegalArgumentException: empty text`::
+
--
This exception is typically seen when an `https` request is made to a node that
is not using `https`. If `https` is desired, please ensure the following setting
is in `elasticsearch.yml`:

[source,yaml]
----------------
xpack.security.http.ssl.enabled: true
----------------

For more information about this setting, see
{ref}/security-settings.html[Security Settings in {es}].
--

`ERROR: unsupported ciphers [...] were requested but cannot be used in this JVM`::
+
--
This error occurs when an SSL/TLS cipher suite is specified that is not
supported by the JVM that {es} is running in. Security tries to use the
specified cipher suites that are supported by this JVM. This error can occur
when using the Security defaults, as some distributions of OpenJDK do not enable
the PKCS11 provider by default. In this case, we recommend consulting your JVM
documentation for details on how to enable the PKCS11 provider.

Another common source of this error is requesting cipher suites that use
encryption with a key length greater than 128 bits when running on an Oracle
JDK. In this case, you must install the
<<ciphers, JCE Unlimited Strength Jurisdiction Policy Files>>.
--

[[trb-security-kerberos]]
=== Common Kerberos exceptions

*Symptoms:*

* User authentication fails due to either GSS negotiation failure
or a service login failure (either on the server or in the {es} http client).
Some of the common exceptions are listed below with some tips to help resolve
them.

*Resolution:*

`Failure unspecified at GSS-API level (Mechanism level: Checksum failed)`::
+
--
When you see this error message on the HTTP client side, it may be
related to an incorrect password.

When you see this error message in the {es} server logs, it may be
related to the {es} service keytab. The keytab file is present but it failed
to log in as the user. Please check the keytab expiry. Also check whether the
keytab contains up-to-date credentials; if not, replace them.

You can use tools like `klist` or `ktab` to list principals inside
the keytab and validate them. You can use `kinit` to see if you can acquire
initial tickets using the keytab. Please check the tools and their documentation
in your Kerberos environment.

Kerberos depends on proper hostname resolution, so please check your DNS
infrastructure. An incorrect DNS setup, incorrect DNS SRV records, or an
incorrect configuration for KDC servers in `krb5.conf` can cause problems with
hostname resolution.
--

`Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))`::

`Failure unspecified at GSS-API level (Mechanism level: Clock skew too great (37))`::
+
--
To prevent replay attacks, Kerberos V5 sets a maximum tolerance for computer
clock synchronization, which is typically 5 minutes. Please check whether
the time on the machines within the domain is in sync.
--
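
The clock tolerance described above amounts to a simple comparison. The
following Python sketch assumes the typical 5 minute limit; the actual limit
depends on your Kerberos configuration, and the `within_skew` helper and the
timestamps are hypothetical values used for illustration.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # typical Kerberos V5 tolerance noted above


def within_skew(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew


server = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(within_skew(server + timedelta(minutes=3), server))  # True
print(within_skew(server - timedelta(minutes=7), server))  # False
```

A client whose clock is seven minutes behind the server would be rejected under
this limit, regardless of whether its credentials are valid.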

Kerberos logs are often cryptic, and many things can go wrong because Kerberos
depends on external services such as DNS and NTP. You might have to enable
additional debug logs to determine the root cause of the issue.

{es} uses a JAAS (Java Authentication and Authorization Service) Kerberos login
module to provide Kerberos support. To enable debug logs on {es} for the login
module, use the following Kerberos realm setting:

[source,yaml]
----------------
xpack.security.authc.realms.<realm-name>.krb.debug: true
----------------

For detailed information, see {ref}/security-settings.html#ref-kerberos-settings[Kerberos realm settings].

Sometimes you may need to go deeper to understand the problem during SPNEGO
GSS context negotiation or look at the Kerberos message exchange. To enable
Kerberos/SPNEGO debug logging on the JVM, add the following JVM system
properties:

`-Dsun.security.krb5.debug=true`

`-Dsun.security.spnego.debug=true`

For more information about JVM system properties, see {ref}/jvm-options.html[configuring JVM options].

[[trb-security-internalserver]]
=== Internal Server Error in Kibana

*Symptoms:*

* In 5.1.1, an `UnhandledPromiseRejectionWarning` occurs and {kib} displays an
Internal Server Error.
//TBD: Is the same true for later releases?

*Resolution:*

If the Security plugin is enabled in {es} but disabled in {kib}, you must
still set `elasticsearch.username` and `elasticsearch.password` in `kibana.yml`.
Otherwise, {kib} cannot connect to {es}.

[[trb-security-setup]]
=== Setup-passwords command fails due to connection failure

The {ref}/setup-passwords.html[elasticsearch-setup-passwords command] sets
passwords for the built-in users by sending user management API requests. If
your cluster uses SSL/TLS for the HTTP (REST) interface, the command attempts to
establish a connection with the HTTPS protocol. If the connection attempt fails,
the command fails.

*Symptoms:*

. {es} is running HTTPS, but the command fails to detect it and returns the
following errors:
+
--
[source, shell]
------------------------------------------
Cannot connect to elasticsearch node.
java.net.SocketException: Unexpected end of file from server
...
ERROR: Failed to connect to elasticsearch at
http://127.0.0.1:9200/_xpack/security/_authenticate?pretty.
Is the URL correct and elasticsearch running?
------------------------------------------
--

. SSL/TLS is configured, but trust cannot be established. The command returns
the following errors:
+
--
[source, shell]
------------------------------------------
SSL connection to
https://127.0.0.1:9200/_xpack/security/_authenticate?pretty
failed: sun.security.validator.ValidatorException:
PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException:
unable to find valid certification path to requested target
Please check the elasticsearch SSL settings under
xpack.security.http.ssl.
...
ERROR: Failed to establish SSL connection to elasticsearch at
https://127.0.0.1:9200/_xpack/security/_authenticate?pretty.
------------------------------------------
--

. The command fails because hostname verification fails, which results in the
following errors:
+
--
[source, shell]
------------------------------------------
SSL connection to
https://idp.localhost.test:9200/_xpack/security/_authenticate?pretty
failed: java.security.cert.CertificateException:
No subject alternative DNS name matching
elasticsearch.example.com found.
Please check the elasticsearch SSL settings under
xpack.security.http.ssl.
...
ERROR: Failed to establish SSL connection to elasticsearch at
https://elasticsearch.example.com:9200/_xpack/security/_authenticate?pretty.
------------------------------------------
--

*Resolution:*

. If your cluster uses SSL/TLS for the HTTP interface but the
`elasticsearch-setup-passwords` command attempts to establish a non-secure
connection, use the `--url` command option to explicitly specify an HTTPS URL.
Alternatively, set the `xpack.security.http.ssl.enabled` setting to `true`.

. If the command does not trust the {es} server, verify that you configured the
`xpack.security.http.ssl.certificate_authorities` setting or the
`xpack.security.http.ssl.truststore.path` setting.

. If hostname verification fails, you can disable this verification by setting
`xpack.security.http.ssl.verification_mode` to `certificate`.

For more information about these settings, see
{ref}/security-settings.html[Security Settings in {es}].
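
The three symptoms map one-to-one onto the three resolutions, so captured
command output can be triaged mechanically. The following Python sketch looks
for the marker text from the error examples above and returns the matching
advice. The `triage` helper and the `HINTS` table are hypothetical, and the
markers are assumed to appear verbatim in the output as they do in the examples.

```python
# Marker text from the error examples above, paired with the matching resolution.
HINTS = [
    ("Unexpected end of file from server",
     "Use the --url option to specify an HTTPS URL."),
    ("unable to find valid certification path",
     "Check the certificate_authorities or truststore.path setting."),
    ("No subject alternative DNS name matching",
     "Set verification_mode to certificate."),
]


def triage(output):
    """Return the hints whose marker text appears in the command output."""
    return [hint for marker, hint in HINTS if marker in output]


sample = (
    "Cannot connect to elasticsearch node.\n"
    "java.net.SocketException: Unexpected end of file from server\n"
)
print(triage(sample))  # ['Use the --url option to specify an HTTPS URL.']
```

Output that matches none of the markers returns an empty list, which suggests a
different cause than the three covered here.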