
First pass at cat docs.

Andrew Raines 12 years ago
commit 8fabeb1c0b

+ 75 - 0
docs/reference/cat.asciidoc

@@ -0,0 +1,75 @@
+[[cat]]
+= cat APIs
+
+[partintro]
+--
+
+["float",id="intro"]
+== Introduction
+
+JSON is great... for computers.  Even if it's pretty-printed, trying
+to find relationships in the data is tedious.  Human eyes, especially
+when looking at an ssh terminal, need compact and aligned text.  The
+cat API aims to meet this need.
+
+[float]
+[[common-parameters]]
+== Common parameters
+
+[float]
+[[verbose]]
+=== Verbose
+
+Each of the commands accepts a query string parameter `v` to turn on
+verbose output.
+
+[source,shell]
+--------------------------------------------------
+% curl 'localhost:9200/_cat/master?v=true'
+id                     ip        node
+EGtKWZlWQYWDmX29fUnp3Q 127.0.0.1 Grey, Sara
+--------------------------------------------------
+
+[float]
+[[numeric-formats]]
+=== Numeric formats
+
+Many commands provide a few types of numeric output, either a byte
+value or a time value.  By default, these types are human-formatted,
+for example, `3.5mb` instead of `3763212`.  The human values are not
+sortable numerically, so when you need to operate on these values in
+sorted order, you can switch to the raw numbers instead.
+
+Say you want to find the largest index in your cluster (storage used
+by all the shards, not number of documents).  The `/_cat/indices` API
+is ideal.  We only need to tweak two things.  First, we turn off
+human mode and use byte-level resolution.  Then we pipe the output
+into `sort` on the appropriate column, which in this case is the
+eighth one.
+
+[source,shell]
+--------------------------------------------------
+% curl '192.168.56.10:9200/_cat/indices?bytes=b' | sort -rnk8
+green wiki2 3 0 10000   0 105274918 105274918
+green wiki1 3 0 10000 413 103776272 103776272
+green foo   1 0   227   0   2065131   2065131
+--------------------------------------------------
+
+
+--
+
+include::cat/allocation.asciidoc[]
+
+include::cat/count.asciidoc[]
+
+include::cat/health.asciidoc[]
+
+include::cat/indices.asciidoc[]
+
+include::cat/master.asciidoc[]
+
+include::cat/nodes.asciidoc[]
+
+include::cat/recovery.asciidoc[]
+
+include::cat/shards.asciidoc[]

+ 17 - 0
docs/reference/cat/allocation.asciidoc

@@ -0,0 +1,17 @@
+[[cat-allocation]]
+== Allocation
+
+`allocation` provides a snapshot of how shards are allocated around
+the cluster and the state of disk usage on each node.
+
+[source,shell]
+--------------------------------------------------
+% curl '192.168.56.10:9200/_cat/allocation?v=1'
+shards diskUsed diskAvail diskRatio ip            node
+     1    5.6gb    72.2gb      7.8% 192.168.56.10 Jarella
+     1    5.6gb    72.2gb      7.8% 192.168.56.30 Solarr
+     1    5.5gb    72.3gb      7.6% 192.168.56.20 Adam II
+--------------------------------------------------
+
+Here we can see that each node has been allocated a single shard and
+that they're all using about the same amount of space.

+ 16 - 0
docs/reference/cat/count.asciidoc

@@ -0,0 +1,16 @@
+[[cat-count]]
+== Count
+
+`count` provides quick access to the document count of the entire
+cluster, or individual indices.
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.10:9200/_cat/indices
+green wiki1 3 0 10000 331 168.5mb 168.5mb
+green wiki2 3 0   428   0     8mb     8mb
+% curl 192.168.56.10:9200/_cat/count
+1384314124582 19:42:04 10428
+% curl 192.168.56.10:9200/_cat/count/wiki2
+1384314139815 19:42:19 428
+--------------------------------------------------

+ 61 - 0
docs/reference/cat/health.asciidoc

@@ -0,0 +1,61 @@
+[[cat-health]]
+== Health
+
+`health` is a terse, one-line representation of the same information
+from `/_cluster/health`. It has one option, `ts`, which can be set to
+`0` to disable the timestamp.
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.10:9200/_cat/health
+1384308967 18:16:07 foo green 3 3 3 3 0 0 0
+% curl '192.168.56.10:9200/_cat/health?v=1&ts=0'
+cluster status nodeTotal nodeData shards pri relo init unassign
+foo     green          3        3      3   3    0    0        0
+--------------------------------------------------
+
+A common use of this command is to verify the health is consistent
+across nodes:
+
+[source,shell]
+--------------------------------------------------
+% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/health
+[1] 20:20:52 [SUCCESS] es3.vm
+1384309218 18:20:18 foo green 3 3 3 3 0 0 0
+[2] 20:20:52 [SUCCESS] es1.vm
+1384309218 18:20:18 foo green 3 3 3 3 0 0 0
+[3] 20:20:52 [SUCCESS] es2.vm
+1384309218 18:20:18 foo green 3 3 3 3 0 0 0
+--------------------------------------------------
+
+A less obvious use is to track recovery of a large cluster over
+time. With enough shards, starting a cluster, or even recovering after
+losing a node, can take time (depending on your network and disks).
+One way to track progress is to run this command in a delayed loop:
+
+[source,shell]
+--------------------------------------------------
+% while true; do curl 192.168.56.10:9200/_cat/health; sleep 120; done
+1384309446 18:24:06 foo red 3 3 20 20 0 0 1812
+1384309566 18:26:06 foo yellow 3 3 950 916 0 12 870
+1384309686 18:28:06 foo yellow 3 3 1328 916 0 12 492
+1384309806 18:30:06 foo green 3 3 1832 916 4 0 0
+^C
+--------------------------------------------------
+
+In this scenario, we can tell that recovery took roughly four minutes.
+If this were going on for hours, we would be able to watch the
+`UNASSIGNED` shards drop precipitously.  If that number remained
+static, we would suspect a problem.
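+
+If you only care about the trend, a variation on the loop above keeps
+just the epoch and the unassigned count.  This is a minimal sketch
+that assumes the column layout shown above, where the unassigned
+count is the last field:
+
+[source,shell]
+--------------------------------------------------
+% while true; do curl -s 192.168.56.10:9200/_cat/health | awk '{print $1, $NF}'; sleep 120; done
+--------------------------------------------------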
+
+[float]
+[[timestamp]]
+=== Why the timestamp?
+
+You typically use the `health` command when a cluster is
+malfunctioning.  During this period, it's extremely important to
+correlate activities across log files, alerting systems, etc.
+
+The timestamp appears in two forms.  The `HH:MM:SS` form is simply
+for quick human consumption.  The epoch form retains more
+information, including the date, and is machine-sortable if your
+recovery spans days.
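+
+When the date matters, the epoch column expands easily on the command
+line.  As a minimal sketch with GNU `date` (output shown in UTC),
+using the epoch value from the first example above:
+
+[source,shell]
+--------------------------------------------------
+% date -ud @1384308967
+Wed Nov 13 02:16:07 UTC 2013
+--------------------------------------------------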

+ 39 - 0
docs/reference/cat/indices.asciidoc

@@ -0,0 +1,39 @@
+[[cat-indices]]
+== Indices
+
+The `indices` command provides a cross-section of each index.  This
+information *spans nodes*.
+
+[source,shell]
+--------------------------------------------------
+% curl 'localhost:9200/_cat/indices/twi*?v=true'
+health index    pri rep docs docs/del size/pri size/total
+green  twitter    2   0  627        7      2mb        2mb
+green  twitter2   2   0  628        0    2.5mb      2.5mb
+--------------------------------------------------
+
+We can tell quickly how many shards make up an index, the number of
+docs, deleted docs, primary store size, and total store size (all
+shards including replicas).
+
+[float]
+[[examples]]
+=== Examples
+
+Which indices are yellow?
+
+[source,shell]
+--------------------------------------------------
+% curl localhost:9200/_cat/indices | grep ^yell
+yellow foo          5 1   4 0    17kb    17kb
+--------------------------------------------------
+
+What's my largest index by disk usage not including replicas?
+
+[source,shell]
+--------------------------------------------------
+% curl 'localhost:9200/_cat/indices?bytes=b' | sort -rnk7
+green  twitter      2 0 627 7 2123797 2123797
+green  wiki         2 0  59 0  575904  575904
+yellow foo          5 1   4 0   17447   17447
+--------------------------------------------------

+ 27 - 0
docs/reference/cat/master.asciidoc

@@ -0,0 +1,27 @@
+[[cat-master]]
+== Master
+
+`master` doesn't have any extra options. It simply displays the
+master's node ID, bound IP address, and node name.
+
+[source,shell]
+--------------------------------------------------
+% curl 'localhost:9200/_cat/master?v=true'
+id                     ip            node
+Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
+--------------------------------------------------
+
+This information is also available via the `nodes` command, but this
+form is slightly shorter when all you want to do is, for example,
+verify that all nodes agree on the master:
+
+[source,shell]
+--------------------------------------------------
+% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/master
+[1] 19:16:37 [SUCCESS] es3.vm
+Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
+[2] 19:16:37 [SUCCESS] es2.vm
+Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
+[3] 19:16:37 [SUCCESS] es1.vm
+Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
+--------------------------------------------------
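+
+To reduce that to a simple agreement check, you can strip the pssh
+status lines and count the distinct answers; a single line of output
+means every node reports the same master.  A minimal sketch, assuming
+the same host list as above:
+
+[source,shell]
+--------------------------------------------------
+% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/master | fgrep -v SUCCESS | sort | uniq -c
+      3 Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
+--------------------------------------------------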

+ 48 - 0
docs/reference/cat/nodes.asciidoc

@@ -0,0 +1,48 @@
+[[cat-nodes]]
+== Nodes
+
+The `nodes` command shows the cluster topology.
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.10:9200/_cat/nodes
+SP4H 4727 192.168.56.30 9300 1.0.0.Beta2 1.6.0_27 72.1gb 35.4 93.9mb 79 239.1mb 0.45 3.4h d m Boneyard
+_uhJ 5134 192.168.56.10 9300 1.0.0.Beta2 1.6.0_27 72.1gb 33.3 93.9mb 85 239.1mb 0.06 3.4h d * Athena
+HfDp 4562 192.168.56.20 9300 1.0.0.Beta2 1.6.0_27 72.2gb 74.5 93.9mb 83 239.1mb 0.12 3.4h d m Zarek
+--------------------------------------------------
+
+The first few columns tell you where your nodes live.  As a sanity
+check, they also tell you which version of Elasticsearch and of the
+JVM each node is running.
+
+[source,shell]
+--------------------------------------------------
+nodeId pid  ip            port es          jdk
+u2PZ   4234 192.168.56.30 9300 1.0.0.Beta1 1.6.0_27
+URzf   5443 192.168.56.10 9300 1.0.0.Beta1 1.6.0_27
+ActN   3806 192.168.56.20 9300 1.0.0.Beta1 1.6.0_27
+--------------------------------------------------
+
+
+The next few give a picture of your heap, memory, and load.
+
+[source,shell]
+--------------------------------------------------
+diskAvail heapPercent heapMax ramPercent  ramMax load
+   72.1gb        31.3  93.9mb         81 239.1mb 0.24
+   72.1gb        19.6  93.9mb         82 239.1mb 0.05
+   72.2gb        64.9  93.9mb         84 239.1mb 0.12
+--------------------------------------------------
+
+The last columns provide ancillary information that can often be
+useful when looking at the cluster as a whole, particularly a large
+one.  How many master-eligible nodes do I have?  How many client
+nodes?  It looks like someone restarted a node recently; which one
+was it?
+
+[source,shell]
+--------------------------------------------------
+uptime data/client master name
+  3.5h d           m      Boneyard
+  3.5h d           *      Athena
+  3.5h d           m      Zarek
+--------------------------------------------------
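+
+Questions like these lend themselves to one-liners.  As a rough
+sketch, assuming the full column layout shown at the top of this page
+(where the master column is the fifteenth field), you could count the
+master-eligible nodes like this:
+
+[source,shell]
+--------------------------------------------------
+% curl -s 192.168.56.10:9200/_cat/nodes | awk '$15 == "m" || $15 == "*"' | wc -l
+3
+--------------------------------------------------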

+ 57 - 0
docs/reference/cat/recovery.asciidoc

@@ -0,0 +1,57 @@
+[[cat-recovery]]
+== Recovery
+
+`recovery` is a view of shard replication.  It shows information
+whenever data from at least one shard is being copied to a different
+node.  Recovery also occurs during cluster restarts.  If your
+recovery process seems stuck, run this command to see whether there
+is any movement.
+
+As an example, let's enable replicas on a cluster which has two
+indices, three shards each.  Afterward we'll have twelve total shards,
+but before those replica shards are `STARTED`, we'll take a snapshot
+of the recovery:
+
+[source,shell]
+--------------------------------------------------
+% curl -XPUT 192.168.56.30:9200/_settings -d'{"number_of_replicas":1}'
+{"ok":true,"acknowledged":true}
+% curl '192.168.56.30:9200/_cat/recovery?v=true'
+index shard   target recovered     % ip            node 
+wiki1 2     68083830   7865837 11.6% 192.168.56.20 Adam II
+wiki2 1      2542400    444175 17.5% 192.168.56.20 Adam II
+wiki2 2      3242108    329039 10.1% 192.168.56.10 Jarella
+wiki2 0      2614132         0  0.0% 192.168.56.30 Solarr
+wiki1 0     60992898   4719290  7.7% 192.168.56.30 Solarr
+wiki1 1     47630362   6798313 14.3% 192.168.56.10 Jarella
+--------------------------------------------------
+
+We have six total shards in recovery (a replica for each primary), at
+varying points of progress.
+
+Let's restart the cluster and then lose a node.  This output shows us
+what was moving around shortly after the node left the cluster.
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.30:9200/_cat/health; curl 192.168.56.30:9200/_cat/recovery
+1384315040 19:57:20 foo yellow 2 2 8 6 0 4 0
+wiki2 2  1621477        0  0.0% 192.168.56.30 Garrett, Jonathan "John"
+wiki2 0  1307488        0  0.0% 192.168.56.20 Commander Kraken
+wiki1 0 32696794 20984240 64.2% 192.168.56.20 Commander Kraken
+wiki1 1 31123128 21951695 70.5% 192.168.56.30 Garrett, Jonathan "John"
+--------------------------------------------------
+
+[float]
+[[big-percent]]
+=== Why am I seeing recovery percentages greater than 100%?
+
+This can happen if a shard copy goes away and comes back while the
+primary is indexing.  The replica shard will catch up with the
+primary by receiving any new segments created during its outage.
+These new segments can contain data from segments the replica already
+has, because they're the result of merging that happened on the
+primary but now live in different, larger segments.  After the new
+segments are copied over, the replica deletes the segments it no
+longer needs, resulting in a dataset that more closely matches the
+primary (or matches it exactly, assuming indexing isn't still
+happening).
+

+ 90 - 0
docs/reference/cat/shards.asciidoc

@@ -0,0 +1,90 @@
+[[cat-shards]]
+== Shards
+
+The `shards` command is the detailed view of which nodes contain
+which shards.  It tells you whether each shard is a primary or a
+replica, the number of docs, the bytes it takes on disk, and the node
+where it's located.
+
+Here we see a single index, with three primary shards and no replicas:
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.20:9200/_cat/shards
+wiki1 0 p STARTED 3014 31.1mb 192.168.56.10 Stiletto
+wiki1 1 p STARTED 3013 29.6mb 192.168.56.30 Frankie Raye
+wiki1 2 p STARTED 3973 38.1mb 192.168.56.20 Commander Kraken
+--------------------------------------------------
+
+[[index-pattern]]
+=== Index pattern
+
+If you have many shards, you may wish to limit which indices show up
+in the output.  You can always do this with `grep`, but you can save
+some bandwidth by appending an index pattern to the URL.
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.20:9200/_cat/shards/wiki2
+wiki2 0 p STARTED 197 3.2mb 192.168.56.10 Stiletto
+wiki2 1 p STARTED 205 5.9mb 192.168.56.30 Frankie Raye
+wiki2 2 p STARTED 275 7.8mb 192.168.56.20 Commander Kraken
+--------------------------------------------------
+
+
+[[relocation]]
+=== Relocation
+
+Let's say you've checked your health and you see two relocating
+shards.  Where are they from and where are they going?
+
+[source,shell]
+--------------------------------------------------
+% curl 192.168.56.10:9200/_cat/health
+1384315316 20:01:56 foo green 3 3 12 6 2 0 0
+% curl 192.168.56.10:9200/_cat/shards | fgrep RELO
+wiki1 0 r RELOCATING 3014 31.1mb 192.168.56.20 Commander Kraken -> 192.168.56.30 Frankie Raye
+wiki1 1 r RELOCATING 3013 29.6mb 192.168.56.10 Stiletto -> 192.168.56.30 Frankie Raye
+--------------------------------------------------
+
+[[states]]
+=== Shard states
+
+Before a shard can be used, it goes through an `INITIALIZING` state.
+`shards` can show you which shards are in that state.
+
+[source,shell]
+--------------------------------------------------
+% curl -XPUT 192.168.56.20:9200/_settings -d'{"number_of_replicas":1}'
+{"ok":true,"acknowledged":true}
+% curl 192.168.56.20:9200/_cat/shards
+wiki1 0 p STARTED      3014 31.1mb 192.168.56.10 Stiletto
+wiki1 0 r INITIALIZING    0 14.3mb 192.168.56.30 Frankie Raye
+wiki1 1 p STARTED      3013 29.6mb 192.168.56.30 Frankie Raye
+wiki1 1 r INITIALIZING    0 13.1mb 192.168.56.20 Commander Kraken
+wiki1 2 r INITIALIZING    0   14mb 192.168.56.10 Stiletto
+wiki1 2 p STARTED      3973 38.1mb 192.168.56.20 Commander Kraken
+--------------------------------------------------
+
+If a shard cannot be assigned, for example because you've
+over-allocated replicas relative to the number of nodes in the
+cluster, the shard will remain `UNASSIGNED`.
+
+[source,shell]
+--------------------------------------------------
+% curl -XPUT 192.168.56.20:9200/_settings -d'{"number_of_replicas":3}'
+% curl 192.168.56.20:9200/_cat/health
+1384316325 20:18:45 foo yellow 3 3 9 3 0 0 3
+% curl 192.168.56.20:9200/_cat/shards
+wiki1 0 p STARTED    3014 31.1mb 192.168.56.10 Stiletto
+wiki1 0 r STARTED    3014 31.1mb 192.168.56.30 Frankie Raye
+wiki1 0 r STARTED    3014 31.1mb 192.168.56.20 Commander Kraken
+wiki1 0 r UNASSIGNED
+wiki1 1 r STARTED    3013 29.6mb 192.168.56.10 Stiletto
+wiki1 1 p STARTED    3013 29.6mb 192.168.56.30 Frankie Raye
+wiki1 1 r STARTED    3013 29.6mb 192.168.56.20 Commander Kraken
+wiki1 1 r UNASSIGNED
+wiki1 2 r STARTED    3973 38.1mb 192.168.56.10 Stiletto
+wiki1 2 r STARTED    3973 38.1mb 192.168.56.30 Frankie Raye
+wiki1 2 p STARTED    3973 38.1mb 192.168.56.20 Commander Kraken
+wiki1 2 r UNASSIGNED
+--------------------------------------------------

+ 2 - 0
docs/reference/index.asciidoc

@@ -11,6 +11,8 @@ include::search.asciidoc[]
 
 include::indices.asciidoc[]
 
+include::cat.asciidoc[]
+
 include::cluster.asciidoc[]
 
 include::query-dsl.asciidoc[]