
  1. [role="xpack"]
  2. [[security-troubleshooting]]
  3. == Troubleshooting security
  4. ++++
  5. <titleabbrev>Troubleshooting</titleabbrev>
  6. ++++
  7. Use the information in this section to troubleshoot common problems and find
  8. answers for frequently asked questions.
  9. * <<security-trb-settings>>
  10. * <<security-trb-roles>>
  11. * <<security-trb-extraargs>>
  12. * <<trouble-shoot-active-directory>>
  13. * <<trb-security-maccurl>>
  14. * <<trb-security-sslhandshake>>
  15. * <<trb-security-ssl>>
  16. * <<trb-security-kerberos>>
  17. * <<trb-security-saml>>
  18. * <<trb-security-internalserver>>
  19. * <<trb-security-setup>>
  20. * <<trb-security-path>>
  21. For issues that you cannot fix yourself … we’re here to help.
  22. If you are an existing Elastic customer with a support contract, please create
  23. a ticket in the
  24. https://support.elastic.co/customers/s/login/[Elastic Support portal].
  25. Or post in the https://discuss.elastic.co/[Elastic forum].
  26. [[security-trb-settings]]
  27. === Some settings are not returned via the nodes settings API
  28. *Symptoms:*
  29. * When you use the <<cluster-nodes-info,nodes info API>> to retrieve
  30. settings for a node, some information is missing.
  31. *Resolution:*
  32. This is intentional. Some of the settings are considered to be highly
  33. sensitive: all `ssl` settings, LDAP `bind_dn`, and `bind_password`.
  34. For this reason, we filter these settings and do not expose them via
  35. the nodes info API REST endpoint. You can also define additional
  36. sensitive settings that should be hidden using the
  37. `xpack.security.hide_settings` setting. For example, this snippet
  38. hides the `url` settings of the `ldap1` realm and all settings of the
  39. `ad1` realm.
  40. [source, yaml]
  41. ------------------------------------------
  42. xpack.security.hide_settings: xpack.security.authc.realms.ldap1.url,
  43. xpack.security.authc.realms.ad1.*
  44. ------------------------------------------
  45. [[security-trb-roles]]
  46. === Authorization exceptions
  47. *Symptoms:*
  48. * I configured the appropriate roles and the users, but I still get an
  49. authorization exception.
  50. * I can authenticate to LDAP, but I still get an authorization exception.
  51. *Resolution:*
  52. . Verify that the role names associated with the users match the roles defined
  53. in the `roles.yml` file. You can use the `elasticsearch-users` tool to list all
  54. the users. Any unknown roles are marked with `*`.
  55. +
  56. --
  57. [source, shell]
  58. ------------------------------------------
  59. bin/elasticsearch-users list
  60. rdeniro : admin
  61. alpacino : power_user
  62. jacknich : monitoring,unknown_role* <1>
  63. ------------------------------------------
  64. <1> `unknown_role` was not found in `roles.yml`
  65. For more information about this command, see the
  66. <<users-command,`elasticsearch-users` command>>.
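
On a node with many users, the flagged roles can be picked out of the listing automatically. The following is a minimal sketch that assumes the `user : role1,role2*` output form shown above; the function name `list_unknown_roles` is illustrative and is not part of the `elasticsearch-users` tool.

[source,shell]
----
# Print "user role" for every role that the listing flags with a trailing "*".
# Reads the listing on stdin, in the "user : role1,role2*" form shown above.
list_unknown_roles() {
  awk -F' *: *' 'NF >= 2 {
    n = split($2, roles, ",")
    for (i = 1; i <= n; i++) {
      role = roles[i]
      gsub(/^[ \t]+|[ \t]+$/, "", role)   # trim whitespace around the role
      if (role ~ /\*$/) {
        sub(/\*$/, "", role)              # drop the "*" marker itself
        print $1, role
      }
    }
  }'
}

# Usage (run from the Elasticsearch home directory):
#   bin/elasticsearch-users list | list_unknown_roles
----

Each printed role is a candidate for a typo in the user's role list or a missing entry in `roles.yml`.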
  67. --
  68. . If you are authenticating to LDAP, a number of configuration options can cause
  69. this error.
  70. +
  71. --
  72. |======================
  73. |_group identification_ |
  74. Groups are located by either an LDAP search or by the "memberOf" attribute on
  75. the user. Also, if subtree search is turned off, the search goes only one
  76. level deep. For all the options, see <<ref-ldap-settings>>.
  77. There are many options here and sticking to the defaults will not work for all
  78. scenarios.
  79. | _group to role mapping_|
  80. Either the `role_mapping.yml` file or the location for this file could be
  81. misconfigured. For more information, see <<security-files>>.
  82. |_role definition_|
  83. The role definition might be missing or invalid.
  84. |======================
  85. To help track down these possibilities, enable additional logging to troubleshoot further.
  86. You can enable debug logging by configuring the following persistent setting:
  87. [source, console]
  88. ----
  89. PUT /_cluster/settings
  90. {
  91. "persistent": {
  92. "logger.org.elasticsearch.xpack.security.authc": "debug"
  93. }
  94. }
  95. ----
  96. Alternatively, you can add the following lines to the end of
  97. the `log4j2.properties` configuration file in the `ES_PATH_CONF`:
  98. [source,properties]
  99. ----------------
  100. logger.authc.name = org.elasticsearch.xpack.security.authc
  101. logger.authc.level = DEBUG
  102. ----------------
  103. Refer to <<configuring-logging-levels,configuring logging levels>> for more
  104. information.
  105. A successful authentication should produce debug statements that list groups and
  106. role mappings.
  107. --
  108. [[security-trb-extraargs]]
  109. === Users command fails due to extra arguments
  110. *Symptoms:*
  111. * The `elasticsearch-users` command fails with the following message:
  112. `ERROR: extra arguments [...] were provided`.
  113. *Resolution:*
  114. This error occurs when the `elasticsearch-users` tool is parsing the input and
  115. finds unexpected arguments. This can happen when there are special characters
  116. used in some of the arguments. For example, on Windows systems the `,` character
  117. is considered a parameter separator; in other words `-r role1,role2` is
  118. translated to `-r role1 role2` and the `elasticsearch-users` tool only
  119. recognizes `role1` as an expected parameter. The solution here is to quote the
  120. parameter: `-r "role1,role2"`.
  121. For more information about this command, see
  122. <<users-command,`elasticsearch-users` command>>.
  123. [[trouble-shoot-active-directory]]
  124. === Users are frequently locked out of Active Directory
  125. *Symptoms:*
  126. * Certain users are being frequently locked out of Active Directory.
  127. *Resolution:*
  128. Check your realm configuration; realms are checked serially, one after another.
  129. If your Active Directory realm is being checked before other realms and there
  130. are usernames that appear in both Active Directory and another realm, a valid
  131. login for one realm might be causing failed login attempts in another realm.
  132. For example, if `UserA` exists in both Active Directory and a file realm, and
  133. the Active Directory realm is checked first and file is checked second, an
  134. attempt to authenticate as `UserA` in the file realm would first attempt to
  135. authenticate against Active Directory and fail, before successfully
  136. authenticating against the `file` realm. Because authentication is verified on
  137. each request, the Active Directory realm would be checked - and fail - on each
  138. request for `UserA` in the `file` realm. In this case, while the authentication
  139. request completed successfully, the account on Active Directory would have
  140. received several failed login attempts, and that account might become
  141. temporarily locked out. Plan the order of your realms accordingly.
  142. Also note that it is not typically necessary to define multiple Active Directory
  143. realms to handle domain controller failures. When using Microsoft DNS, the DNS
  144. entry for the domain should always point to an available domain controller.
  145. [[trb-security-maccurl]]
  146. === Certificate verification fails for curl on Mac
  147. *Symptoms:*
  148. * `curl` on the Mac returns a certificate verification error even when the
  149. `--cacert` option is used.
  150. *Resolution:*
  151. Apple's integration of `curl` with their keychain technology disables the
  152. `--cacert` option.
  153. See http://curl.haxx.se/mail/archive-2013-10/0036.html for more information.
  154. You can use another tool, such as `wget`, to test certificates. Alternatively, you
  155. can add the certificate for the signing certificate authority to the macOS system
  156. keychain, using a procedure similar to the one detailed at the
  157. http://support.apple.com/kb/PH14003[Apple knowledge base]. Be sure to add the
  158. signing CA's certificate and not the server's certificate.
  159. [[trb-security-sslhandshake]]
  160. === SSLHandshakeException causes connections to fail
  161. *Symptoms:*
  162. * An `SSLHandshakeException` causes a connection to a node to fail and indicates
  163. that there is a configuration issue. Some of the common exceptions are shown
  164. below with tips on how to resolve these issues.
  165. *Resolution:*
  166. `java.security.cert.CertificateException: No name matching node01.example.com found`::
  167. +
  168. --
  169. Indicates that a client connection was made to `node01.example.com` but the
  170. certificate returned did not contain the name `node01.example.com`. In most
  171. cases, the issue can be resolved by ensuring the name is specified during
  172. certificate creation. For more information, see <<encrypt-internode-communication>>. Another scenario is
  173. when the environment does not wish to use DNS names in certificates at all. In
  174. this scenario, all settings in `elasticsearch.yml` should only use IP addresses
  175. including the `network.publish_host` setting.
  176. --
  177. `java.security.cert.CertificateException: No subject alternative names present`::
  178. +
  179. --
  180. Indicates that a client connection was made to an IP address but the returned
  181. certificate did not contain any `SubjectAlternativeName` entries. IP addresses
  182. are only used for hostname verification if they are specified as a
  183. `SubjectAlternativeName` during certificate creation. If the intent was to use
  184. IP addresses for hostname verification, then the certificate will need to be
  185. regenerated with the appropriate IP address. See <<encrypt-internode-communication>>.
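
To see which names and IP addresses a certificate actually carries, you can inspect it with `openssl`. This is a sketch that assumes a PEM-encoded certificate and that the `openssl` command-line tool is installed; the function name and the path in the usage comment are hypothetical.

[source,shell]
----
# Print the Subject Alternative Name entries of a PEM certificate.
# Prints nothing and returns a non-zero status when the extension is absent.
show_subject_alt_names() {
  openssl x509 -in "$1" -noout -text | grep -A1 'Subject Alternative Name'
}

# Usage (hypothetical path):
#   show_subject_alt_names config/certs/node01.crt
----

If the address that clients connect to is not listed as an `IP Address` entry, the certificate cannot pass hostname verification for that address.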
  186. --
  187. `javax.net.ssl.SSLHandshakeException: null cert chain` and `javax.net.ssl.SSLException: Received fatal alert: bad_certificate`::
  188. +
  189. --
  190. The `SSLHandshakeException` indicates that a self-signed certificate was
  191. returned by the client that is not trusted as it cannot be found in the
  192. `truststore` or `keystore`. This `SSLException` is seen on the client side of
  193. the connection.
  194. --
  195. `sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target` and `javax.net.ssl.SSLException: Received fatal alert: certificate_unknown`::
  196. +
  197. --
  198. This `SunCertPathBuilderException` indicates that a certificate was returned
  199. during the handshake that is not trusted. This message is seen on the client
  200. side of the connection. The `SSLException` is seen on the server side of the
  201. connection. The CA certificate that signed the returned certificate was not
  202. found in the `keystore` or `truststore` and needs to be added to trust this
  203. certificate.
  204. --
  205. `javax.net.ssl.SSLHandshakeException: Invalid ECDH ServerKeyExchange signature`::
  206. +
  207. --
  208. The `Invalid ECDH ServerKeyExchange signature` can indicate that a key and a corresponding certificate don't match and are
  209. causing the handshake to fail.
  210. Verify the contents of each of the files you are using for your configured certificate authorities, certificates and keys. In particular, check that the key and certificate belong to the same key pair.
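
One way to check that a key and a certificate belong to the same key pair is to compare their public keys. The following sketch assumes PEM-encoded files, a private key that `openssl` can read without extra options, and an installed `openssl` command-line tool; the function name and the paths in the usage comment are illustrative.

[source,shell]
----
# Succeeds (exit status 0) only when the certificate and the private key
# share the same public key. Returns 2 if either file cannot be read.
cert_matches_key() {
  cert_file=$1
  key_file=$2
  # Extract the public key from each file in the same PEM form.
  cert_pub=$(openssl x509 -in "$cert_file" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$key_file" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Usage (hypothetical paths):
#   cert_matches_key config/certs/node01.crt config/certs/node01.key \
#     && echo "key and certificate match"
----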
  211. --
  212. [[trb-security-ssl]]
  213. === Common SSL/TLS exceptions
  214. *Symptoms:*
  215. * You might see some exceptions related to SSL/TLS in your logs. Some of the
  216. common exceptions are shown below with tips on how to resolve these issues. +
  217. *Resolution:*
  218. `WARN: received plaintext http traffic on a https channel, closing connection`::
  219. +
  220. --
  221. Indicates that there was an incoming plaintext http request. This typically
  222. occurs when an external application attempts to make an unencrypted call to the
  223. REST interface. Please ensure that all applications are using `https` when
  224. calling the REST interface with SSL enabled.
  225. --
  226. `org.elasticsearch.common.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record:`::
  227. +
  228. --
  229. Indicates that there was incoming plaintext traffic on an SSL connection. This
  230. typically occurs when a node is not configured to use encrypted communication
  231. and tries to connect to nodes that are using encrypted communication. Please
  232. verify that all nodes are using the same setting for
  233. `xpack.security.transport.ssl.enabled`.
  234. For more information about this setting, see
  235. <<security-settings>>.
  236. --
  237. `java.io.StreamCorruptedException: invalid internal transport message format, got`::
  238. +
  239. --
  240. Indicates an issue with data received on the transport interface in an unknown
  241. format. This can happen when a node with encrypted communication enabled
  242. connects to a node that has encrypted communication disabled. Please verify that
  243. all nodes are using the same setting for `xpack.security.transport.ssl.enabled`.
  244. For more information about this setting, see
  245. <<security-settings>>.
  246. --
  247. `java.lang.IllegalArgumentException: empty text`::
  248. +
  249. --
  250. This exception is typically seen when a `https` request is made to a node that
  251. is not using `https`. If `https` is desired, please ensure the following setting
  252. is in `elasticsearch.yml`:
  253. [source,yaml]
  254. ----------------
  255. xpack.security.http.ssl.enabled: true
  256. ----------------
  257. For more information about this setting, see
  258. <<security-settings>>.
  259. --
  260. `ERROR: unsupported ciphers [...] were requested but cannot be used in this JVM`::
  261. +
  262. --
  263. This error occurs when an SSL/TLS cipher suite is specified that cannot be supported
  264. by the JVM that {es} is running in. Security tries to use the specified cipher
  265. suites that are supported by this JVM. This error can occur when using the
  266. Security defaults as some distributions of OpenJDK do not enable the PKCS11
  267. provider by default. In this case, we recommend consulting your JVM
  268. documentation for details on how to enable the PKCS11 provider.
  269. Another common source of this error is requesting cipher suites that use
  270. encryption with a key length greater than 128 bits when running on an Oracle JDK.
  271. In this case, you must install the
  272. <<ciphers, JCE Unlimited Strength Jurisdiction Policy Files>>.
  273. --
  274. [[trb-security-kerberos]]
  275. === Common Kerberos exceptions
  276. *Symptoms:*
  277. * User authentication fails due to either GSS negotiation failure
  278. or a service login failure (either on the server or in the {es} http client).
  279. Some of the common exceptions are listed below with some tips to help resolve
  280. them.
  281. *Resolution:*
  282. `Failure unspecified at GSS-API level (Mechanism level: Checksum failed)`::
  283. +
  284. --
  285. When you see this error message on the HTTP client side, then it may be
  286. related to an incorrect password.
  287. When you see this error message in the {es} server logs, then it may be
  288. related to the {es} service keytab. The keytab file is present but it failed
  289. to log in as the user. Please check the keytab expiry. Also check whether the
  290. keytab contains up-to-date credentials; if not, replace them.
  291. You can use tools like `klist` or `ktab` to list principals inside
  292. the keytab and validate them. You can use `kinit` to see if you can acquire
  293. initial tickets using the keytab. Please check the tools and their documentation
  294. in your Kerberos environment.
  295. Kerberos depends on proper hostname resolution, so please check your DNS infrastructure.
  296. Incorrect DNS setup, DNS SRV records or configuration for KDC servers in `krb5.conf`
  297. can cause problems with hostname resolution.
  298. --
  299. `Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))`::
  300. `Failure unspecified at GSS-API level (Mechanism level: Clock skew too great (37))`::
  301. +
  302. --
  303. To prevent replay attacks, Kerberos V5 sets a maximum tolerance for computer
  304. clock synchronization and it is typically 5 minutes. Please check whether
  305. the time on the machines within the domain is in sync.
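
As a rough check, you can compare epoch timestamps collected from two machines. This sketch assumes the typical 300 second (5 minute) tolerance mentioned above; the function name and the host in the usage comment are hypothetical, and your KDC might be configured with a different limit.

[source,shell]
----
# Compare two epoch timestamps (in seconds) and fail when they differ by
# more than the allowed skew. The default of 300 seconds mirrors the
# typical 5 minute Kerberos tolerance.
within_clock_skew() {
  local_epoch=$1
  remote_epoch=$2
  max_skew=${3:-300}
  diff=$((local_epoch - remote_epoch))
  if [ "$diff" -lt 0 ]; then
    diff=$((0 - diff))   # absolute value
  fi
  [ "$diff" -le "$max_skew" ]
}

# Usage (hypothetical host name):
#   within_clock_skew "$(date -u +%s)" "$(ssh kdc.example.com date -u +%s)" \
#     || echo "clock skew exceeds tolerance"
----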
  306. --
  307. `gss_init_sec_context() failed: An unsupported mechanism was requested`::
  308. `No credential found for: 1.2.840.113554.1.2.2 usage: Accept`::
  309. +
  310. --
  311. You would usually see this error message on the client side when using `curl` to
  312. test {es} Kerberos setup. For example, these messages occur when you are using
  313. an old version of curl on the client and therefore Kerberos SPNEGO support is missing.
  314. The Kerberos realm in {es} only supports the SPNEGO mechanism (Oid 1.3.6.1.5.5.2);
  315. it does not yet support the Kerberos mechanism (Oid 1.2.840.113554.1.2.2).
  316. Make sure that:
  317. * You have installed curl version 7.49 or above as older versions of curl have
  318. known Kerberos bugs.
  319. * The curl installed on your machine has `GSS-API`, `Kerberos` and `SPNEGO`
  320. features listed when you invoke command `curl -V`. If not, you will need to
  321. compile a version of `curl` with this support.
  322. To download the latest curl version, visit https://curl.haxx.se/download.html
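
The feature check can be scripted. The following is a sketch that reads `curl -V` output on stdin and assumes the feature names appear on a line that starts with `Features:`; the function name is illustrative.

[source,shell]
----
# Read `curl -V` output on stdin and report which of the required features
# are missing from its "Features:" line. Exit status 0 means none are missing.
missing_curl_features() {
  features=$(grep -i '^Features:' | tr '[:upper:]' '[:lower:]')
  missing=""
  for wanted in gss-api kerberos spnego; do
    case " $features " in
      *" $wanted "*) ;;                   # feature is present
      *) missing="$missing $wanted" ;;    # remember what is absent
    esac
  done
  [ -z "$missing" ] && return 0
  echo "missing:$missing"
  return 1
}

# Usage:
#   curl -V | missing_curl_features
----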
  323. --
  324. Kerberos logs are often cryptic, and many things can go wrong because
  325. Kerberos depends on external services like DNS and NTP. You might
  326. have to enable additional debug logs to determine the root cause of the issue.
  327. {es} uses a JAAS (Java Authentication and Authorization Service) Kerberos login
  328. module to provide Kerberos support. To enable debug logs on {es} for the login
  329. module, use the following Kerberos realm setting:
  330. [source,yaml]
  331. ----------------
  332. xpack.security.authc.realms.kerberos.<realm-name>.krb.debug: true
  333. ----------------
  334. For detailed information, see <<ref-kerberos-settings>>.
  335. Sometimes you may need to go deeper to understand the problem during SPNEGO
  336. GSS context negotiation or look at the Kerberos message exchange. To enable
  337. Kerberos/SPNEGO debug logging on the JVM, add the following JVM system properties:
  338. `-Dsun.security.krb5.debug=true`
  339. `-Dsun.security.spnego.debug=true`
  340. For more information about JVM system properties, see <<set-jvm-options>>.
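
For a one-off debugging session, one option is to pass both properties through the `ES_JAVA_OPTS` environment variable before starting the node. This is a sketch; it assumes you start {es} from the same shell, and it keeps any options that are already set in the variable.

[source,shell]
----
# Append the two debug properties to any JVM options that are already set.
export ES_JAVA_OPTS="${ES_JAVA_OPTS:-} -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"

# Then start the node from the same shell, for example:
#   bin/elasticsearch
----

Remove the properties again after troubleshooting, because this debug output is verbose.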
  341. [[trb-security-saml]]
  342. === Common SAML issues
  343. Some of the common SAML problems are shown below with tips on how to resolve
  344. these issues.
  345. . *Symptoms:*
  346. +
  347. --
  348. Authentication in {kib} fails and the following error is printed in the {es}
  349. logs:
  350. ....
  351. Cannot find any matching realm for [SamlPrepareAuthenticationRequest{realmName=saml1,
  352. assertionConsumerServiceURL=https://my.kibana.url/api/security/saml/callback}]
  353. ....
  354. *Resolution:*
  355. In order to initiate a SAML authentication, {kib} needs to know which SAML realm
  356. it should use from the ones that are configured in {es}. You can use the
  357. `xpack.security.authc.providers.saml.<provider-name>.realm` setting to explicitly
  358. set the SAML realm name in {kib}. It must match the name of the SAML realm that is
  359. configured in {es}.
  360. If you get an error like the one above, it possibly means that the value of
  361. `xpack.security.authc.providers.saml.<provider-name>.realm` in your {kib}
  362. configuration is wrong. Verify that it matches the name of the configured realm
  363. in {es}, which is the string after `xpack.security.authc.realms.saml.` in your
  364. {es} configuration.
  365. --
  366. . *Symptoms:*
  367. +
  368. --
  369. Authentication in {kib} fails and the following error is printed in the
  370. {es} logs:
  371. ....
  372. Authentication to realm saml1 failed - Provided SAML response is not valid for realm
  373. saml/saml1 (Caused by ElasticsearchSecurityException[Conditions
  374. [https://5aadb9778c594cc3aad0efc126a0f92e.kibana.company....ple.com/]
  375. do not match required audience
  376. [https://5aadb9778c594cc3aad0efc126a0f92e.kibana.company.example.com]])
  377. ....
  378. *Resolution:*
  379. We received a SAML response that is addressed to another SAML Service Provider.
  380. This usually means that the configured SAML Service Provider Entity ID in
  381. `elasticsearch.yml` (`sp.entity_id`) does not match what has been configured as
  382. the SAML Service Provider Entity ID in the SAML Identity Provider documentation.
  383. To resolve this issue, ensure that both the saml realm in {es} and the IdP are
  384. configured with the same string for the SAML Entity ID of the Service Provider.
  385. In the {es} log, just before the exception message (above), there will also be
  386. one or more `INFO` level messages of the form
  387. ....
  388. Audience restriction
  389. [https://5aadb9778c594cc3aad0efc126a0f92e.kibana.company.example.com/]
  390. does not match required audience
  391. [https://5aadb9778c594cc3aad0efc126a0f92e.kibana.company.example.com]
  392. (difference starts at character [#68] [/] vs [])
  393. ....
  394. This log message can assist in determining the difference between the value that
  395. was received from the IdP and the value that has been configured in {es}.
  396. The text in parentheses that describes the difference between the two audience
  397. identifiers will only be shown if the two strings are considered to be similar.
  398. TIP: These strings are compared as case-sensitive strings and not as
  399. canonicalized URLs even when the values are URL-like. Be mindful of trailing
  400. slashes, port numbers, etc.
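
When the hint in parentheses is not shown, you can locate the first difference between the two values yourself. The following is a minimal sketch of such a comparison; the function name is illustrative and it assumes single-line ASCII values such as URLs.

[source,shell]
----
# Print the 1-based position of the first character at which two strings
# differ, or "identical" when they match exactly (case-sensitive).
first_difference() {
  a=$1
  b=$2
  if [ "$a" = "$b" ]; then
    echo "identical"
    return 0
  fi
  i=1
  # Walk both strings one character at a time until the characters differ.
  while [ "$(printf '%s' "$a" | cut -c "$i")" = "$(printf '%s' "$b" | cut -c "$i")" ]; do
    i=$((i + 1))
  done
  echo "$i"
}

# Usage:
#   first_difference "<audience from the IdP>" "<sp.entity_id from elasticsearch.yml>"
----

For the two audience values in the log message above, this prints `68`, the position of the trailing slash.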
  401. --
  402. . *Symptoms:*
  403. +
  404. --
  405. Authentication in {kib} fails and the following error is printed in the
  406. {es} logs:
  407. ....
  408. Cannot find metadata for entity [your:entity.id] in [metadata.xml]
  409. ....
  410. *Resolution:*
  411. We could not find the metadata for the SAML Entity ID `your:entity.id` in the
  412. configured metadata file (`metadata.xml`).
  413. .. Ensure that the `metadata.xml` file you are using is indeed the one provided
  414. by your SAML Identity Provider.
  415. .. Ensure that the `metadata.xml` file contains one `<EntityDescriptor>` element
  416. as follows: `<EntityDescriptor ID="0597c9aa-e69b-46e7-a1c6-636c7b8a8070" entityID="https://saml.example.com/f174199a-a96e-4201-88f1-0d57a610c522/" ...`
  417. where the value of the `entityID` attribute is the same as the value of the
  418. `idp.entity_id` that you have set in your SAML realm configuration in
  419. `elasticsearch.yml`.
  420. .. Note that these are also compared as case-sensitive strings and not as
  421. canonicalized URLs even when the values are URL-like.
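
To see which entity IDs a metadata file declares, you can extract the `entityID` attributes directly. This sketch assumes the attribute values are written with double quotes, as in the example above; the function name is illustrative.

[source,shell]
----
# List every entityID attribute value found in a SAML metadata file.
list_entity_ids() {
  grep -o 'entityID="[^"]*"' "$1" | sed 's/^entityID="//; s/"$//'
}

# Usage:
#   list_entity_ids metadata.xml
----

Compare the printed value character by character with the `idp.entity_id` in your realm configuration.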
  422. --
  423. . *Symptoms:*
  424. +
  425. --
  426. Authentication in {kib} fails and the following error is printed in the {es}
  427. logs:
  428. ....
  429. unable to authenticate user [<unauthenticated-saml-user>]
  430. for action [cluster:admin/xpack/security/saml/authenticate]
  431. ....
  432. *Resolution:*
  433. This error indicates that {es} failed to process the incoming SAML
  434. authentication message. Since the message can't be processed, {es} is not aware
  435. of who the to-be authenticated user is and the `<unauthenticated-saml-user>`
  436. placeholder is used instead. To diagnose the _actual_ problem, you must check
  437. the {es} logs for further details.
  438. --
  439. . *Symptoms:*
  440. +
  441. --
  442. Authentication in {kib} fails and the following error is printed in the {es}
  443. logs:
  444. ....
  445. Authentication to realm <saml-realm-name> failed - SAML Attribute [<AttributeName0>] for
  446. [xpack.security.authc.realms.saml.<saml-realm-name>.attributes.principal] not found in saml attributes
  447. [<AttributeName1>=<AttributeValue1>, <AttributeName2>=<AttributeValue2>, ...] or NameID [ NameID(format)=value ]
  448. ....
  449. *Resolution:*
  450. This error indicates that {es} failed to find the necessary SAML attribute in the SAML response that the
  451. Identity Provider sent. In this example, {es} is configured as follows:
  452. ....
  453. xpack.security.authc.realms.saml.<saml-realm-name>.attributes.principal: AttributeName0
  454. ....
  455. This configuration means that {es} expects to find a SAML Attribute with the name `AttributeName0` or a `NameID` with the appropriate format in the SAML
  456. response so that <<saml-attributes-mapping,it can map it>> to the `principal` user property. The `principal` user property is a
  457. mandatory one, so if this mapping can't happen, the authentication fails.
  458. If you are attempting to map a `NameID`, make sure that the expected `NameID` format matches the one that is sent.
  459. See <<saml-attribute-mapping-nameid>> for more details.
  460. If you are attempting to map a SAML attribute and it is not part of the list in the error message, it might mean
  461. that you have misspelled the attribute name, or that the IdP is not sending this particular attribute. You might
  462. be able to use another attribute from the list to map to `principal` or consult with your IdP administrator to
  463. determine if the required attribute can be sent.
  464. --
  465. . *Symptoms:*
  466. +
  467. --
  468. Authentication in {kib} fails and the following error is printed in the {es}
  469. logs:
  470. ....
  471. Cannot find [{urn:oasis:names:tc:SAML:2.0:metadata}IDPSSODescriptor]/[urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect] in descriptor
  472. ....
  473. *Resolution:*
  474. This error indicates that the SAML metadata for your Identity Provider does not contain a `<SingleSignOnService>` endpoint with binding of
  475. HTTP-Redirect (urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect). {es} supports only the `HTTP-Redirect` binding for SAML authentication
  476. requests (and it doesn't support the `HTTP-POST` binding). Consult your IdP administrator in order to enable at least one
  477. `<SingleSignOnService>` supporting `HTTP-Redirect` binding and update your IdP SAML Metadata.
  478. --
  479. . *Symptoms:*
  480. +
  481. --
  482. Authentication in {kib} fails and the following error is printed in the
  483. {es} logs:
  484. ....
  485. Authentication to realm my-saml-realm failed -
  486. Provided SAML response is not valid for realm saml/my-saml-realm
  487. (Caused by ElasticsearchSecurityException[SAML Response is not a 'success' response:
  488. The SAML IdP did not grant the request. It indicated that the Elastic Stack side sent
  489. something invalid (urn:oasis:names:tc:SAML:2.0:status:Requester). Specific status code which might
  490. indicate what the issue is: [urn:oasis:names:tc:SAML:2.0:status:InvalidNameIDPolicy]]
  491. )
  492. ....
  493. *Resolution:*
  494. This means that the SAML Identity Provider failed to authenticate the user and
  495. sent a SAML Response to the Service Provider ({stack}) indicating this failure.
  496. The message will convey whether the SAML Identity Provider thinks that the problem
  497. is with the Service Provider ({stack}) or with the Identity Provider itself and
  498. the specific status code that follows is extremely useful as it usually indicates
  499. the underlying issue. The list of specific error codes is defined in the
  500. https://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf[SAML 2.0 Core specification - Section 3.2.2.2]
  501. and the most commonly encountered ones are:
  502. . `urn:oasis:names:tc:SAML:2.0:status:AuthnFailed`: The SAML Identity Provider failed to
  503. authenticate the user. There is not much to troubleshoot on the {stack} side for this status; the logs of
  504. the SAML Identity Provider will hopefully offer much more information.
  505. . `urn:oasis:names:tc:SAML:2.0:status:InvalidNameIDPolicy`: The SAML Identity Provider cannot support
  506. releasing a NameID with the requested format. When creating SAML Authentication Requests, {es} sets
  507. the NameIDPolicy element of the Authentication request with the appropriate value. This is controlled
  508. by the <<ref-saml-settings,`nameid_format`>> configuration parameter in
  509. `elasticsearch.yml`, which if not set defaults to `urn:oasis:names:tc:SAML:2.0:nameid-format:transient`.
  510. This instructs the Identity Provider to return a NameID with that specific format in the SAML Response. If
  511. the SAML Identity Provider cannot grant that request, for example because it is configured to release a
  512. NameID format with `urn:oasis:names:tc:SAML:2.0:nameid-format:persistent` format instead, it returns this error
  513. indicating an invalid NameID policy. This issue can be resolved by adjusting `nameid_format` to match the format
  514. the SAML Identity Provider can return or by setting it to `urn:oasis:names:tc:SAML:2.0:nameid-format:unspecified`
  515. so that the Identity Provider is allowed to return any format it wants.
  516. --
  517. . *Symptoms:*
  518. +
  519. --
  520. Authentication in {kib} fails and the following error is printed in the
  521. {es} logs:
  522. ....
  523. The XML Signature of this SAML message cannot be validated. Please verify that the saml
  524. realm uses the correct SAMLmetadata file/URL for this Identity Provider
  525. ....
  526. *Resolution:*
This means that {es} failed to validate the digital signature of the SAML
message that the Identity Provider sent. {es} uses the public key of the
Identity Provider, which is included in the SAML metadata, to validate
the signature that the IdP created using its corresponding private key.
This validation can fail for a number of reasons:
  532. .. As the error message indicates, the most common cause is that the wrong
  533. metadata file is used and as such the public key it contains doesn't correspond
  534. to the private key the Identity Provider uses.
  535. .. The configuration of the Identity Provider has changed or the key has been
  536. rotated and the metadata file that {es} is using has not been updated.
  537. .. The SAML Response has been altered in transit and the signature cannot be
  538. validated even though the correct key is used.
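
For the first two causes, it can help to confirm which metadata source the realm actually reads. The fragment below is only a sketch: the realm name `saml1` and the metadata URL are placeholders, and the key layout assumes a 7.x or later release.

[source,yaml]
----
# elasticsearch.yml (realm name and URL are hypothetical)
# Location of the IdP metadata that contains the signing certificate.
xpack.security.authc.realms.saml.saml1.idp.metadata.path: "https://idp.example.com/saml/metadata"
# How often HTTPS metadata is checked for changes, for example after a key rotation.
xpack.security.authc.realms.saml.saml1.idp.metadata.http.refresh: 1h
----

If the metadata is a local file instead, it probably needs to be replaced with a fresh copy from the Identity Provider after a key rotation.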
NOTE: The private keys, public keys, and self-signed X.509 certificates that
are used in SAML for digital signatures as described above have no relation to
the keys and certificates that are used for TLS on either the transport or the
http layer. A failure such as the one described above has nothing to do with
your `xpack.ssl` related configuration.
  544. --
  545. . *Symptoms:*
  546. +
  547. --
Users are unable to log in with a local username and password in {kib} because
  549. SAML is enabled.
  550. *Resolution:*
  551. If you want your users to be able to use local credentials to authenticate to
  552. {kib} in addition to using the SAML realm for Single Sign-On, you must enable
the `basic` `authProvider` in {kib}. The process is documented in the
<<saml-kibana-basic, SAML Guide>>.
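
A minimal sketch of such a configuration in `kibana.yml` follows. The provider names `saml1` and `basic1` and the realm name are placeholders, and the nested `xpack.security.authc.providers` layout is an assumption that applies to more recent {kib} releases. Older releases use a different syntax, so treat the linked guide as authoritative:

[source,yaml]
----
# kibana.yml (provider and realm names are hypothetical)
xpack.security.authc.providers:
  saml.saml1:
    order: 0        # SAML Single Sign-On is offered first
    realm: saml1    # must match the SAML realm name configured in Elasticsearch
  basic.basic1:
    order: 1        # local username and password login remains available
----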
  555. --
  556. . *Symptoms:*
  557. +
  558. --
  559. No SAML request ID values are being passed from {kib} to {es}:
  560. ....
  561. Caused by org.elasticsearch.ElasticsearchSecurityException: SAML content is in-response-to [_A1B2C3D4E5F6G8H9I0] but expected one of []
  562. ....
  563. *Resolution:*
This error indicates that {es} received a SAML response tied to a particular SAML request, but {kib}
didn't explicitly specify the ID of that request. This usually means that {kib} cannot find the user session where
it previously stored the SAML request ID.
  567. To resolve this issue, ensure that in your {kib} configuration `xpack.security.sameSiteCookies` is not set to `Strict`.
  568. Depending on your configuration, you may be able to rely on the default value or explicitly set the value to `None`.
For further information, see
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie/SameSite[MDN SameSite cookies].

If you serve multiple {kib} installations behind a load balancer, make sure to use the
https://www.elastic.co/guide/en/kibana/current/production.html#load-balancing-kibana[same security configuration]
for all installations.
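
As an illustration of the explicit option mentioned above, the setting could look like the following in `kibana.yml`. Browsers typically accept `SameSite=None` cookies only over HTTPS, so this sketch assumes that {kib} is served over TLS:

[source,yaml]
----
# kibana.yml
# Allow the session cookie to be sent on the cross-site POST from the Identity Provider.
xpack.security.sameSiteCookies: "None"
----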
  574. --
  575. *Logging:*
  576. If the previous resolutions do not solve your issue, enable additional
  577. logging for the SAML realm to troubleshoot further. You can enable debug
  578. logging by configuring the following persistent setting:
[source, console]
----
PUT /_cluster/settings
{
  "persistent": {
    "logger.org.elasticsearch.xpack.security.authc.saml": "debug"
  }
}
----
  588. Alternatively, you can add the following lines to the end of the
  589. `log4j2.properties` configuration file in the `ES_PATH_CONF`:
[source,properties]
----------------
logger.saml.name = org.elasticsearch.xpack.security.authc.saml
logger.saml.level = DEBUG
----------------
  595. Refer to <<configuring-logging-levels,configuring logging levels>> for more
  596. information.
  597. [[trb-security-internalserver]]
  598. === Internal Server Error in Kibana
  599. *Symptoms:*
  600. * In 5.1.1, an `UnhandledPromiseRejectionWarning` occurs and {kib} displays an
  601. Internal Server Error.
  602. //TBD: Is the same true for later releases?
  603. *Resolution:*
  604. If the Security plugin is enabled in {es} but disabled in {kib}, you must
  605. still set `elasticsearch.username` and `elasticsearch.password` in `kibana.yml`.
  606. Otherwise, {kib} cannot connect to {es}.
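
A minimal sketch of those two settings, with placeholder credentials that you would replace with the user {kib} uses to connect to {es}:

[source,yaml]
----
# kibana.yml (both values are placeholders)
elasticsearch.username: "kibana"
elasticsearch.password: "changeme"
----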
  607. [[trb-security-setup]]
  608. === Setup-passwords command fails due to connection failure
  609. The <<setup-passwords,elasticsearch-setup-passwords command>> sets
  610. passwords for the built-in users by sending user management API requests. If
  611. your cluster uses SSL/TLS for the HTTP (REST) interface, the command attempts to
  612. establish a connection with the HTTPS protocol. If the connection attempt fails,
  613. the command fails.
  614. *Symptoms:*
  615. . {es} is running HTTPS, but the command fails to detect it and returns the
  616. following errors:
  617. +
  618. --
[source, shell]
------------------------------------------
Cannot connect to elasticsearch node.
java.net.SocketException: Unexpected end of file from server
...
ERROR: Failed to connect to elasticsearch at
http://127.0.0.1:9200/_security/_authenticate?pretty.
Is the URL correct and elasticsearch running?
------------------------------------------
  628. --
  629. . SSL/TLS is configured, but trust cannot be established. The command returns
  630. the following errors:
  631. +
  632. --
[source, shell]
------------------------------------------
SSL connection to
https://127.0.0.1:9200/_security/_authenticate?pretty
failed: sun.security.validator.ValidatorException:
PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException:
unable to find valid certification path to requested target
Please check the elasticsearch SSL settings under
xpack.security.http.ssl.
...
ERROR: Failed to establish SSL connection to elasticsearch at
https://127.0.0.1:9200/_security/_authenticate?pretty.
------------------------------------------
  647. --
  648. . The command fails because hostname verification fails, which results in the
  649. following errors:
  650. +
  651. --
[source, shell]
------------------------------------------
SSL connection to
https://elasticsearch.example.com:9200/_security/_authenticate?pretty
failed: java.security.cert.CertificateException:
No subject alternative DNS name matching
elasticsearch.example.com found.
Please check the elasticsearch SSL settings under
xpack.security.http.ssl.
...
ERROR: Failed to establish SSL connection to elasticsearch at
https://elasticsearch.example.com:9200/_security/_authenticate?pretty.
------------------------------------------
  665. --
  666. *Resolution:*
  667. . If your cluster uses TLS/SSL for the HTTP interface but the
  668. `elasticsearch-setup-passwords` command attempts to establish a non-secure
  669. connection, use the `--url` command option to explicitly specify an HTTPS URL.
  670. Alternatively, set the `xpack.security.http.ssl.enabled` setting to `true`.
  671. . If the command does not trust the {es} server, verify that you configured the
  672. `xpack.security.http.ssl.certificate_authorities` setting or the
  673. `xpack.security.http.ssl.truststore.path` setting.
  674. . If hostname verification fails, you can disable this verification by setting
  675. `xpack.security.http.ssl.verification_mode` to `certificate`.
  676. For more information about these settings, see
  677. <<security-settings>>.
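
The settings named in the resolutions above could be combined as in the following sketch. The CA file path is a placeholder, and whether you need the `verification_mode` line depends on whether hostname verification is the failing step:

[source,yaml]
----
# elasticsearch.yml (the certificate path is hypothetical)
# Lets the command detect that the HTTP interface uses TLS.
xpack.security.http.ssl.enabled: true
# CA certificate used to establish trust in the server certificate.
xpack.security.http.ssl.certificate_authorities: [ "certs/ca.crt" ]
# Validates the certificate chain but skips hostname verification.
xpack.security.http.ssl.verification_mode: certificate
----

Skipping hostname verification weakens the TLS checks, so issuing a certificate with a matching subject alternative name is usually the better long-term fix.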
  678. [[trb-security-path]]
  679. === Failures due to relocation of the configuration files
  680. *Symptoms:*
  681. * Active Directory or LDAP realms might stop working after upgrading to {es} 6.3
  682. or later releases. In 6.4 or later releases, you might see messages in the {es}
  683. log that indicate a config file is in a deprecated location.
  684. *Resolution:*
  685. By default, in 6.2 and earlier releases, the security configuration files are
  686. located in the `ES_PATH_CONF/x-pack` directory, where `ES_PATH_CONF` is an
  687. environment variable that defines the location of the
  688. <<config-files-location,config directory>>.
  689. In 6.3 and later releases, the config directory no longer contains an `x-pack`
  690. directory. The files that were stored in this folder, such as the
  691. `log4j2.properties`, `role_mapping.yml`, `roles.yml`, `users`, and `users_roles`
  692. files, now exist directly in the config directory.
  693. IMPORTANT: If you upgraded to 6.3 or later releases, your old security
  694. configuration files still exist in an `x-pack` folder. That file path is
  695. deprecated, however, and you should move your files out of that folder.
  696. In 6.3 and later releases, settings such as `files.role_mapping` default to
  697. `ES_PATH_CONF/role_mapping.yml`. If you do not want to use the default locations,
  698. you must update the settings appropriately. See
  699. <<security-settings>>.
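
For example, pointing an LDAP realm at a non-default role mapping file might look like the following. The realm name `ldap1` and the file path are placeholders, and the key layout shown assumes a 7.x or later release; 6.x releases use a different realm key layout:

[source,yaml]
----
# elasticsearch.yml (realm name and path are hypothetical)
xpack.security.authc.realms.ldap.ldap1.files.role_mapping: "/etc/elasticsearch/custom/role_mapping.yml"
----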