1 year ago · 1a55e2fa76
--- a/docs/reference/troubleshooting.asciidoc
+++ b/docs/reference/troubleshooting.asciidoc
@@ -138,3 +138,5 @@ include::troubleshooting/troubleshooting-searches.asciidoc[]
 
				 include::troubleshooting/troubleshooting-shards-capacity.asciidoc[]
			
 
				 
			
 
				 include::troubleshooting/troubleshooting-unbalanced-cluster.asciidoc[]
			
 
				+
			
 
				+include::troubleshooting/diagnostic.asciidoc[]
			
--- a/docs/reference/troubleshooting/diagnostic.asciidoc
+++ b/docs/reference/troubleshooting/diagnostic.asciidoc
@@ -0,0 +1,152 @@
 
				+[[diagnostic]]
			
 
				+== Capturing diagnostics
			
 
				+++++
			
 
				+<titleabbrev>Capture diagnostics</titleabbrev>
			
 
				+++++
			
 
				+:keywords: Elasticsearch diagnostic, diagnostics
			
 
				+
			
 
				+The {es} https://github.com/elastic/support-diagnostics[Support Diagnostic] tool captures a point-in-time snapshot of cluster statistics and most settings. 
			
 
				+It works against all {es} versions. 
			
 
				+
			
 
				+This information can be used to troubleshoot problems with your cluster. For examples of issues that you can troubleshoot using the Support Diagnostic tool's output, refer to https://www.elastic.co/blog/why-does-elastic-support-keep-asking-for-diagnostic-files[the Elastic blog].
			
 
				+
			
 
				+You can generate diagnostic information using this tool before you contact https://support.elastic.co[Elastic Support] or 
			
 
				+https://discuss.elastic.co[Elastic Discuss] to minimize turnaround time. 
			
 
				+
			
 
				+[discrete]
			
 
				+[[diagnostic-tool-requirements]]
			
 
				+=== Requirements
			
 
				+
			
 
				+-  Java Runtime Environment or Java Development Kit v1.8 or higher
			
 
				+
			
 
				+[discrete]
			
 
				+[[diagnostic-tool-access]]
			
 
				+=== Access the tool
			
 
				+
			
 
				+The Support Diagnostic tool is included as a sub-library in some Elastic deployments: 
			
 
				+
			
 
				+* {ece}: Located under **{ece}** > **Deployment** > **Operations** > 
			
 
				+**Prepare Bundle** > **{es}**. 
			
 
				+* {eck}: Run as https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-take-eck-dump.html[`eck-diagnostics`].
			
 
				+
			
 
				+You can also directly download the `diagnostics-X.X.X-dist.zip` file for the latest Support Diagnostic release
			
 
				+from https://github.com/elastic/support-diagnostics/releases/latest[the `support-diagnostics` repo].
			
 
				+
			
 
				+
			
 
				+[discrete]
			
 
				+[[diagnostic-capture]]
			
 
				+=== Capture diagnostic information
			
 
				+
			
 
				+To capture an {es} diagnostic: 
			
 
				+
			
 
				+. In a terminal, verify that your network and user permissions are sufficient to connect to your {es} 
			
 
				+cluster by polling the cluster's <<cluster-health,health>>.
			
 
				++
			
 
				+For example, with the parameters `host:localhost`, `port:9200`, and `username:elastic`, you'd use the following curl request:
			
 
				++
			
 
				+[source,sh]
			
 
				+----
			
 
				+curl -X GET -k -u elastic https://localhost:9200/_cluster/health
			
 
				+----
			
 
				+// NOTCONSOLE
			
 
				++
			
 
				+If you receive an HTTP 200 `OK` response, then you can proceed to the next step. If you receive a different 
			
 
				+response code, then <<diagnostic-non-200,diagnose the issue>> before proceeding.
			
 
				+
			
 
				+. Using the same environment parameters, run the diagnostic tool script. 
			
 
				++
			
 
				+For information about the parameters that you can pass to the tool, refer to the https://github.com/elastic/support-diagnostics#standard-options[diagnostic 
			
 
				+parameter reference]. 
			
 
				++
			
 
				+The following command options are recommended:
			
 
				++
			
 
				+**Unix-based systems**
			
 
				++
			
 
				+[source,sh]
			
 
				+----
			
 
				+sudo ./diagnostics.sh --type local --host localhost --port 9200 -u elastic -p --bypassDiagVerify --ssl --noVerify
			
 
				+----
			
 
				++
			
 
				+**Windows**
			
 
				++
			
 
				+[source,sh]
			
 
				+----
			
 
				+.\diagnostics.bat --type local --host localhost --port 9200 -u elastic -p --bypassDiagVerify --ssl --noVerify
			
 
				+----
			
 
				++
			
 
				+[TIP]
			
 
				+.Script execution modes
			
 
				+====
			
 
				+You can execute the script in three https://github.com/elastic/support-diagnostics#diagnostic-types[modes]: 
			
 
				+
			
 
				+* `local` (default, recommended): Polls the <<rest-apis,{es} API>>, 
			
 
				+gathers operating system info, and captures cluster and GC logs. 
			
 
				+
			
 
				+* `remote`: Establishes an SSH session 
			
 
				+to the applicable target server to pull the same information as `local`.
			
 
				+
			
 
				+* `api`: Polls the <<rest-apis,{es} API>>. All other data must be 
			
 
				+collected manually.
			
 
				+====
			
 
				+
			
 
				+. When the script has completed, verify that no errors were logged to `diagnostic.log`. 
			
 
				+If the log file contains errors, then refer to <<diagnostic-log-errors,Diagnose errors in `diagnostic.log`>>.
			
 
				+
			
 
				+. If the script completed without errors, then an archive with the format `<diagnostic type>-diagnostics-<DateTimeStamp>.zip` is created in the working directory, or an output directory you have specified. You can review or share the diagnostic archive as needed.
			
 
				+
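The health check in step 1 can be scripted so that the diagnostic only runs against a reachable cluster. The following is a minimal sketch, not part of the Support Diagnostic tool: the function names `health_status_code` and `ready_for_diagnostic` are hypothetical, and the base URL and username are supplied by the caller. It relies only on standard curl options (`-s`, `-k`, `-o`, `-w`, `-u`).

```shell
#!/bin/sh
# Hypothetical helper: print only the HTTP status code of the cluster health request.
#   $1 = base URL (for example https://localhost:9200)
#   $2 = username (curl prompts for the password)
# -s hides progress output, -k skips TLS certificate verification,
# -o /dev/null discards the response body, -w '%{http_code}' prints the status code.
health_status_code() {
  curl -s -k -o /dev/null -w '%{http_code}' -u "$2" "$1/_cluster/health"
}

# Hypothetical helper: succeed only when the cluster health request returned 200.
ready_for_diagnostic() {
  code=$(health_status_code "$1" "$2")
  [ "$code" = "200" ]
}
```

For example, `ready_for_diagnostic https://localhost:9200 elastic && echo "proceed"` prints `proceed` only after a 200 response. Any other status should be diagnosed before running the tool.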
			
 
				+[discrete]
			
 
				+[[diagnostic-non-200]]
			
 
				+=== Diagnose a non-200 cluster health response
			
 
				+
			
 
				+When you poll your cluster health, if you receive any response other than `200 OK`, then the diagnostic tool 
			
 
				+might not work as intended. The following are possible error codes and their resolutions:
			
 
				+
			
 
				+HTTP 401 `UNAUTHENTICATED`::
			
 
				+Additional information in the error will usually indicate either 
			
 
				+that your `username:password` pair is invalid, or that your `.security` 
			
 
				+index is unavailable and you need to set up a temporary 
			
 
				+<<file-realm,file-based realm>> user with `role:superuser` to authenticate.
			
 
				+
			
 
				+HTTP 403 `UNAUTHORIZED`::
			
 
				+Your `username` is recognized but 
			
 
				+has insufficient permissions to run the diagnostic. Either use a different 
			
 
				+username or elevate the user's privileges.
			
 
				+
			
 
				+HTTP 429 `TOO_MANY_REQUESTS` (for example, `circuit_breaking_exception`)::
			
 
				+Your username was authenticated and authorized, but the cluster is under 
			
 
				+sufficiently high strain that it's not responding to API calls. These 
			
 
				+responses are usually intermittent. You can proceed with running the diagnostic, 
			
 
				+but the diagnostic results might be incomplete.
			
 
				+
			
 
				+HTTP 504 `GATEWAY_TIMEOUT`::
			
 
				+Your network is experiencing issues reaching the cluster. You might be using a proxy or firewall. 
			
 
				+Consider running the diagnostic tool from a different location, confirming your port, or using an IP
			
 
				+instead of a URL domain. 
			
 
				+
			
 
				+HTTP 503 `SERVICE_UNAVAILABLE` (for example, `master_not_discovered_exception`)::
			
 
				+Your cluster does not currently have an elected master node, which is 
			
 
				+required for it to be API-responsive. This might be temporary while the master 
			
 
				+node rotates. If the issue persists, then <<cluster-fault-detection,investigate the cause>> 
			
 
				+before proceeding. 
			
 
				+
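The status codes above can be condensed into a small lookup for use in a wrapper script. This is an illustrative sketch only: the function name `advise_for_status` and the wording of each message are assumptions that paraphrase the resolutions listed in this section, and nothing here is provided by the diagnostic tool itself.

```shell
#!/bin/sh
# Hypothetical helper: map a cluster health HTTP status code to the
# next action suggested in this section.
advise_for_status() {
  case "$1" in
    200) echo "proceed" ;;
    401) echo "check credentials or the .security index" ;;
    403) echo "use a user with sufficient privileges" ;;
    429) echo "cluster under strain; results might be incomplete" ;;
    503) echo "no elected master; investigate before proceeding" ;;
    504) echo "network issue; check proxy, firewall, port, or host" ;;
    *)   echo "unexpected status; investigate before proceeding" ;;
  esac
}
```

For example, `advise_for_status 429` prints `cluster under strain; results might be incomplete`.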
			
 
				+[discrete]
			
 
				+[[diagnostic-log-errors]]
			
 
				+=== Diagnose errors in `diagnostic.log`
			
 
				+
			
 
				+The following are common errors that you might encounter when running the diagnostic tool:
			
 
				+
			
 
				+* `Error: Could not find or load main class com.elastic.support.diagnostics.DiagnosticApp`
			
 
				++
			
 
				+This indicates that you accidentally downloaded the source code file 
			
 
				+instead of `diagnostics-X.X.X-dist.zip` from the releases page.
			
 
				+
			
 
				+* `Could not retrieve the Elasticsearch version due to a system or network error - unable to continue.` 
			
 
				++ 
			
 
				+This indicates that the diagnostic couldn't run commands against the cluster. 
			
 
				+Poll the cluster's health again, and ensure that you're using the same parameters 
			
 
				+when you run the diagnostic batch or shell file.
			
 
				+
			
 
				+* A `security_exception` that includes `is unauthorized for user`:
			
 
				++
			
 
				+The provided user has insufficient admin permissions to run the diagnostic tool. Use another
			
 
				+user, or grant the user `role:superuser` privileges.
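
The three errors listed above can be searched for in one pass. The sketch below assumes only that the messages appear verbatim in the log file. The function name `log_has_known_errors` is hypothetical, and the check does not detect errors other than the three quoted here.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given log file contains any of the
# error messages quoted in this section.
#   $1 = path to diagnostic.log
# -F treats each pattern as a fixed string, -q suppresses output,
# and each -e adds one pattern to search for.
log_has_known_errors() {
  grep -F -q \
    -e 'Could not find or load main class' \
    -e 'Could not retrieve the Elasticsearch version' \
    -e 'is unauthorized for user' \
    "$1"
}
```

For example, `log_has_known_errors diagnostic.log && echo "review the log"` prints a reminder only when one of the known messages is present.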