
[DOCS] Reformats Profile API (#47168)

* [DOCS] Reformats Profile API.

* [DOCS] Fixes failing docs test.
István Zoltán Szabó, 6 years ago
commit d0faf354c6
1 changed file with 197 additions and 135 deletions
  1. +197 −135
      docs/reference/search/profile.asciidoc


@@ -1,20 +1,30 @@
 [[search-profile]]
 === Profile API
 
-WARNING:  The Profile API is a debugging tool and adds significant overhead to search execution.
+WARNING: The Profile API is a debugging tool and adds significant overhead to search execution.
 
-The Profile API provides detailed timing information about the execution of individual components
-in a search request.  It gives the user insight into how search requests are executed at a low level so that
-the user can understand why certain requests are slow, and take steps to improve them.
-Note that the Profile API, <<profile-limitations, amongst other things>>, doesn't measure
-network latency, time spent in the search fetch phase, time spent while the requests spends
-in queues or while merging shard responses on the coordinating node.
+Provides detailed timing information about the execution of individual 
+components in a search request.
 
-The output from the Profile API is *very* verbose, especially for complicated requests executed across
-many shards. Pretty-printing the response is recommended to help understand the output
 
-[float]
-==== Usage
+[[search-profile-api-desc]]
+==== {api-description-title}
+
+The Profile API gives the user insight into how search requests are executed at 
+a low level so that the user can understand why certain requests are slow, and 
+take steps to improve them. Note that the Profile API, 
+<<profile-limitations, amongst other things>>, doesn't measure network latency, 
+time spent in the search fetch phase, time the request spends waiting in 
+queues, or time spent merging shard responses on the coordinating node.
+
+The output from the Profile API is *very* verbose, especially for complicated 
+requests executed across many shards. Pretty-printing the response is 
+recommended to help understand the output.
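Outside of a console that pretty-prints for you, a captured response can be reformatted before reading it. A minimal Python sketch, assuming the response body has already been saved as a compact JSON string (the payload below is a hypothetical, heavily trimmed fragment, not a real response):

```python
import json

# A hypothetical, heavily trimmed profile response fragment (illustration only).
raw = '{"profile":{"shards":[{"id":"[node][twitter][0]","searches":[]}]}}'

# json.loads + json.dumps with indent pretty-prints the compact payload.
pretty = json.dumps(json.loads(raw), indent=2)
print(pretty)
```

The same effect is available from the command line with `python -m json.tool response.json`.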
+
+
+[[search-profile-api-example]]
+==== {api-examples-title}
+
 
 Any `_search` request can be profiled by adding a top-level `profile` parameter:
 
@@ -31,9 +41,10 @@ GET /twitter/_search
 // TEST[setup:twitter]
 
 <1> Setting the top-level `profile` parameter to `true` will enable profiling
-for the search
+for the search.
+
 
-This will yield the following result:
+The API returns the following result:
 
 [source,console-result]
 --------------------------------------------------
@@ -167,12 +178,13 @@ This will yield the following result:
 // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
 // TESTRESPONSE[s/\[2aE02wS1R8q_QFnYu6vDVQ\]\[twitter\]\[0\]/$body.$_path/]
 
-<1> Search results are returned, but were omitted here for brevity
+<1> Search results are returned, but were omitted here for brevity.
 
-Even for a simple query, the response is relatively complicated.  Let's break it down piece-by-piece before moving
-to more complex examples.
+Even for a simple query, the response is relatively complicated.  Let's break it 
+down piece-by-piece before moving to more complex examples.
 
-First, the overall structure of the profile response is as follows:
+
+The overall structure of the profile response is as follows:
 
 [source,console-result]
 --------------------------------------------------
@@ -200,27 +212,37 @@ First, the overall structure of the profile response is as follows:
 // TESTRESPONSE[s/"query": \[...\]/"query": $body.$_path/]
 // TESTRESPONSE[s/"collector": \[...\]/"collector": $body.$_path/]
 // TESTRESPONSE[s/"aggregations": \[...\]/"aggregations": []/]
-<1> A profile is returned for each shard that participated in the response, and is identified
-by a unique ID
-<2> Each profile contains a section which holds details about the query execution
-<3> Each profile has a single time representing the cumulative rewrite time
-<4> Each profile also contains a section about the Lucene Collectors which run the search
-<5> Each profile contains a section which holds the details about the aggregation execution
-
-Because a search request may be executed against one or more shards in an index, and a search may cover
-one or more indices, the top level element in the profile response is an array of `shard` objects.
-Each shard object lists its `id` which uniquely identifies the shard.  The ID's format is
+<1> A profile is returned for each shard that participated in the response, and 
+is identified by a unique ID.
+<2> Each profile contains a section which holds details about the query 
+execution.
+<3> Each profile has a single time representing the cumulative rewrite time.
+<4> Each profile also contains a section about the Lucene Collectors which run 
+the search.
+<5> Each profile contains a section which holds the details about the 
+aggregation execution.
+
+Because a search request may be executed against one or more shards in an index, 
+and a search may cover one or more indices, the top level element in the profile 
+response is an array of `shard` objects. Each shard object lists its `id` which 
+uniquely identifies the shard.  The ID's format is 
 `[nodeID][indexName][shardID]`.
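Given that format, a shard ID can be split back into its parts when post-processing a profile; a small sketch (the sample ID is copied from the example response above):

```python
import re

def parse_shard_id(shard_id):
    """Split a profile shard id of the form [nodeID][indexName][shardID]."""
    m = re.fullmatch(r"\[([^\]]+)\]\[([^\]]+)\]\[(\d+)\]", shard_id)
    if m is None:
        raise ValueError(f"unexpected shard id format: {shard_id!r}")
    node_id, index_name, shard_num = m.groups()
    return node_id, index_name, int(shard_num)

print(parse_shard_id("[2aE02wS1R8q_QFnYu6vDVQ][twitter][0]"))
```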
 
-The profile itself may consist of one or more "searches", where a search is a query executed against the underlying
-Lucene index.  Most search requests submitted by the user will only execute a single `search` against the Lucene index.
-But occasionally multiple searches will be executed, such as including a global aggregation (which needs to execute
-a secondary "match_all" query for the global context).
+The profile itself may consist of one or more "searches", where a search is a 
+query executed against the underlying Lucene index.  Most search requests 
+submitted by the user will only execute a single `search` against the Lucene 
+index. But occasionally multiple searches will be executed, such as including a 
+global aggregation (which needs to execute a secondary "match_all" query for the 
+global context).
 
 Inside each `search` object there will be two arrays of profiled information:
-a `query` array and a `collector` array.  Alongside the `search` object is an `aggregations` object that contains the profile information for the aggregations. In the future, more sections may be added, such as `suggest`, `highlight`, etc.
+a `query` array and a `collector` array.  Alongside the `search` object is an 
+`aggregations` object that contains the profile information for the 
+aggregations. In the future, more sections may be added, such as `suggest`, 
+`highlight`, etc.
 
-There will also be a `rewrite` metric showing the total time spent rewriting the query (in nanoseconds).
+There will also be a `rewrite` metric showing the total time spent rewriting the 
+query (in nanoseconds).
 
 NOTE: As with other statistics apis, the Profile API supports human readable outputs. This can be turned on by adding
 `?human=true` to the query string. In this case, the output contains the additional `time` field containing rounded,
@@ -245,10 +267,11 @@ the `advance` phase of that query is the cause, for example.
 [[query-section]]
 ===== `query` Section
 
-The `query` section contains detailed timing of the query tree executed by Lucene on a particular shard.
-The overall structure of this query tree will resemble your original Elasticsearch query, but may be slightly
-(or sometimes very) different.  It will also use similar but not always identical naming.  Using our previous
-`match` query example, let's analyze the `query` section:
+The `query` section contains detailed timing of the query tree executed by 
+Lucene on a particular shard. The overall structure of this query tree will 
+resemble your original Elasticsearch query, but may be slightly (or sometimes 
+very) different.  It will also use similar but not always identical naming. 
+Using our previous `match` query example, let's analyze the `query` section:
 
 [source,console-result]
 --------------------------------------------------
@@ -279,25 +302,30 @@ The overall structure of this query tree will resemble your original Elasticsear
 // TESTRESPONSE[s/]$/],"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": []}]}}/]
 // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
 // TESTRESPONSE[s/"breakdown": \{...\}/"breakdown": $body.$_path/]
-<1> The breakdown timings are omitted for simplicity
+<1> The breakdown timings are omitted for simplicity.
 
-Based on the profile structure, we can see that our `match` query was rewritten by Lucene into a BooleanQuery with two
-clauses (both holding a TermQuery).  The `type` field displays the Lucene class name, and often aligns with
-the equivalent name in Elasticsearch.  The `description` field displays the Lucene explanation text for the query, and
-is made available to help differentiating between parts of your query (e.g. both `message:search` and `message:test`
-are TermQuery's and would appear identical otherwise.
+Based on the profile structure, we can see that our `match` query was rewritten 
+by Lucene into a BooleanQuery with two clauses (both holding a TermQuery).  The 
+`type` field displays the Lucene class name, and often aligns with the 
+equivalent name in Elasticsearch.  The `description` field displays the Lucene 
+explanation text for the query, and is made available to help differentiate 
+between parts of your query (e.g. both `message:search` and `message:test` are 
+TermQueries and would appear identical otherwise).
 
-The `time_in_nanos` field shows that this query took ~1.8ms for the entire BooleanQuery to execute.  The recorded time is inclusive
-of all children.
+The `time_in_nanos` field shows that this query took ~1.8ms for the entire 
+BooleanQuery to execute.  The recorded time is inclusive of all children.
 
-The `breakdown` field will give detailed stats about how the time was spent, we'll look at
-that in a moment.  Finally, the `children` array lists any sub-queries that may be present.  Because we searched for two
-values ("search test"), our BooleanQuery holds two children TermQueries.  They have identical information (type, time,
-breakdown, etc).  Children are allowed to have their own children.
+The `breakdown` field will give detailed stats about how the time was spent; 
+we'll look at that in a moment.  Finally, the `children` array lists any 
+sub-queries that may be present.  Because we searched for two values ("search 
+test"), our BooleanQuery holds two children TermQueries.  They have identical 
+information (type, time, breakdown, etc).  Children are allowed to have their 
+own children.
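Because each node's time is inclusive of its children, the tree can be walked recursively to separate total time from self time (total minus the children's totals). A sketch operating on a hypothetical, simplified query profile fragment (the timing numbers are made up):

```python
def walk(node, depth=0):
    """Print total and self time (ns) for each node in a profiled query tree."""
    children = node.get("children", [])
    self_ns = node["time_in_nanos"] - sum(c["time_in_nanos"] for c in children)
    print("  " * depth + f'{node["type"]}: total={node["time_in_nanos"]}ns self={self_ns}ns')
    for child in children:
        walk(child, depth + 1)

# Hypothetical, trimmed profile: a BooleanQuery with two TermQuery children.
query = {
    "type": "BooleanQuery",
    "time_in_nanos": 1873811,
    "children": [
        {"type": "TermQuery", "time_in_nanos": 391943},
        {"type": "TermQuery", "time_in_nanos": 210682},
    ],
}
walk(query)
```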
 
-====== Timing Breakdown
+===== Timing Breakdown
 
-The `breakdown` component lists detailed timing statistics about low-level Lucene execution:
+The `breakdown` component lists detailed timing statistics about low-level 
+Lucene execution:
 
 [source,console-result]
 --------------------------------------------------
@@ -326,10 +354,12 @@ The `breakdown` component lists detailed timing statistics about low-level Lucen
 // TESTRESPONSE[s/}$/},\n"children": $body.$_path}],\n"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": []}]}}/]
 // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
 
-Timings are listed in wall-clock nanoseconds and are not normalized at all.  All caveats about the overall
-`time_in_nanos` apply here.  The intention of the breakdown is to give you a feel for A) what machinery in Lucene is
-actually eating time, and B) the magnitude of differences in times between the various components.  Like the overall time,
-the breakdown is inclusive of all children times.
+Timings are listed in wall-clock nanoseconds and are not normalized at all. All 
+caveats about the overall `time_in_nanos` apply here.  The intention of the 
+breakdown is to give you a feel for A) what machinery in Lucene is actually 
+eating time, and B) the magnitude of differences in times between the various 
+components. Like the overall time, the breakdown is inclusive of all children 
+times.
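To get a feel for those relative magnitudes, the timing entries can be turned into percentages of the node's breakdown total. A sketch with made-up breakdown numbers (note that `*_count` entries are call counts, not times, and must be excluded):

```python
# Hypothetical breakdown timings in nanoseconds; *_count fields are call counts.
breakdown = {
    "score": 51306, "score_count": 4,
    "next_doc": 53876, "next_doc_count": 5,
    "build_scorer": 2935582, "build_scorer_count": 6,
}

# Keep only the timing entries, then report each as a share of their sum.
timings = {k: v for k, v in breakdown.items() if not k.endswith("_count")}
total = sum(timings.values())
for stat, ns in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{stat}: {ns}ns ({100 * ns / total:.1f}%)")
```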
 
 The meanings of the stats are as follows:
 
@@ -399,13 +429,15 @@ The meaning of the stats are as follows:
     means the `nextDoc()` method was called on two different documents.  This can be used to help judge
     how selective queries are, by comparing counts between different query components.
 
+
 [[collectors-section]]
 ===== `collectors` Section
 
-The Collectors portion of the response shows high-level execution details. Lucene works by defining a "Collector"
-which is responsible for coordinating the traversal, scoring, and collection of matching documents.  Collectors
-are also how a single query can record aggregation results, execute unscoped "global" queries, execute post-query
-filters, etc.
+The Collectors portion of the response shows high-level execution details. 
+Lucene works by defining a "Collector" which is responsible for coordinating the 
+traversal, scoring, and collection of matching documents.  Collectors are also 
+how a single query can record aggregation results, execute unscoped "global" 
+queries, execute post-query filters, etc.
 
 Looking at the previous example:
 
@@ -430,15 +462,20 @@ Looking at the previous example:
 // TESTRESPONSE[s/]$/]}], "aggregations": []}]}}/]
 // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
 
-We see a single collector named `SimpleTopScoreDocCollector` wrapped into `CancellableCollector`. `SimpleTopScoreDocCollector` is the default "scoring and sorting"
-`Collector` used by Elasticsearch.  The `reason` field attempts to give a plain English description of the class name.  The
-`time_in_nanos` is similar to the time in the Query tree: a wall-clock time inclusive of all children.  Similarly, `children` lists
-all sub-collectors. The `CancellableCollector` that wraps `SimpleTopScoreDocCollector` is used by Elasticsearch to detect if the current
-search was cancelled and stop collecting documents as soon as it occurs.
 
-It should be noted that Collector times are **independent** from the Query times.  They are calculated, combined,
-and normalized independently!  Due to the nature of Lucene's execution, it is impossible to "merge" the times
-from the Collectors into the Query section, so they are displayed in separate portions.
+We see a single collector named `SimpleTopScoreDocCollector` wrapped into 
+`CancellableCollector`. `SimpleTopScoreDocCollector` is the default "scoring and 
+sorting" `Collector` used by {es}. The `reason` field attempts to give a plain 
+English description of the class name.  The `time_in_nanos` is similar to the 
+time in the Query tree: a wall-clock time inclusive of all children. Similarly, 
+`children` lists all sub-collectors. The `CancellableCollector` that wraps 
+`SimpleTopScoreDocCollector` is used by {es} to detect if the current search was 
+cancelled and stop collecting documents as soon as it occurs.
+
+It should be noted that Collector times are **independent** from the Query 
+times.  They are calculated, combined, and normalized independently! Due to the 
+nature of Lucene's execution, it is impossible to "merge" the times from the 
+Collectors into the Query section, so they are displayed in separate portions.
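Since the collector section is its own small tree (via `children`), it can be flattened and reported separately from the query tree. A sketch over a hypothetical collector tree shaped like the example above (timings made up):

```python
def flatten(collector, depth=0):
    """Yield (depth, name, reason, time_in_nanos) for a collector and its children."""
    yield depth, collector["name"], collector["reason"], collector["time_in_nanos"]
    for child in collector.get("children", []):
        yield from flatten(child, depth + 1)

# Hypothetical collector tree mirroring the example above.
tree = {
    "name": "CancellableCollector",
    "reason": "search_cancelled",
    "time_in_nanos": 304311,
    "children": [
        {"name": "SimpleTopScoreDocCollector",
         "reason": "search_top_hits",
         "time_in_nanos": 32273},
    ],
}
for depth, name, reason, ns in flatten(tree):
    print("  " * depth + f"{name} ({reason}): {ns}ns")
```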
 
 For reference, the various collector reasons are:
 
@@ -489,20 +526,22 @@ For reference, the various collector reasons are:
 [[rewrite-section]]
 ===== `rewrite` Section
 
-All queries in Lucene undergo a "rewriting" process.  A query (and its sub-queries) may be rewritten one or
-more times, and the process continues until the query stops changing.  This process allows Lucene to perform
-optimizations, such as removing redundant clauses, replacing one query for a more efficient execution path,
-etc.  For example a Boolean -> Boolean -> TermQuery can be rewritten to a TermQuery, because all the Booleans
-are unnecessary in this case.
+All queries in Lucene undergo a "rewriting" process.  A query (and its 
+sub-queries) may be rewritten one or more times, and the process continues until 
+the query stops changing.  This process allows Lucene to perform optimizations, 
+such as removing redundant clauses, replacing one query with a more efficient 
+execution path, etc. For example, a Boolean -> Boolean -> TermQuery can be 
+rewritten to a TermQuery, because all the Booleans are unnecessary in this case.
 
-The rewriting process is complex and difficult to display, since queries can change drastically.  Rather than
-showing the intermediate results, the total rewrite time is simply displayed as a value (in nanoseconds).  This
-value is cumulative and contains the total time for all queries being rewritten.
+The rewriting process is complex and difficult to display, since queries can 
+change drastically. Rather than showing the intermediate results, the total 
+rewrite time is simply displayed as a value (in nanoseconds). This value is 
+cumulative and contains the total time for all queries being rewritten.
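A nanosecond value like the rewrite time is easier to read once scaled; the sketch below approximates the kind of rounding a human-readable output provides (it is an illustration only, not the exact formatting Elasticsearch uses for `?human=true`):

```python
def human_time(nanos):
    """Scale a nanosecond value to the largest fitting unit, rounded to one decimal.
    An approximation for illustration, not Elasticsearch's exact formatting."""
    for unit, factor in (("s", 1_000_000_000), ("ms", 1_000_000), ("micros", 1_000)):
        if nanos >= factor:
            return f"{nanos / factor:.1f}{unit}"
    return f"{nanos}nanos"

print(human_time(1873811))  # a rewrite time of ~1.9ms
```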
 
 ===== A more complex example
 
-
-To demonstrate a slightly more complex query and the associated results, we can profile the following query:
+To demonstrate a slightly more complex query and the associated results, we can 
+profile the following query:
 
 [source,console]
 --------------------------------------------------
@@ -550,7 +589,8 @@ This example has:
 - A global aggregation
 - A post_filter
 
-And the response:
+
+The API returns the following result:
 
 [source,console-result]
 --------------------------------------------------
@@ -660,47 +700,58 @@ And the response:
 // TESTRESPONSE[s/\.\.\.//]
 // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
 // TESTRESPONSE[s/"id": "\[P6-vulHtQRWuD4YnubWb7A\]\[test\]\[0\]"/"id": $body.profile.shards.0.id/]
-<1> The `"aggregations"` portion has been omitted because it will be covered in the next section
+<1> The `"aggregations"` portion has been omitted because it will be covered in 
+the next section.
 
-As you can see, the output is significantly more verbose than before.  All the major portions of the query are
-represented:
+As you can see, the output is significantly more verbose than before.  All the 
+major portions of the query are represented:
 
-1. The first `TermQuery` (user:test) represents the main `term` query
-2. The second `TermQuery` (message:some) represents the `post_filter` query
+1. The first `TermQuery` (user:test) represents the main `term` query.
+2. The second `TermQuery` (message:some) represents the `post_filter` query.
 
-The Collector tree is fairly straightforward, showing how a single CancellableCollector wraps a MultiCollector
- which also wraps a FilteredCollector to execute the post_filter (and in turn wraps the normal scoring SimpleCollector),
- a BucketCollector to run all scoped aggregations.
+The Collector tree is fairly straightforward, showing how a single 
+CancellableCollector wraps a MultiCollector, which in turn wraps a 
+FilteredCollector to execute the post_filter (itself wrapping the normal 
+scoring SimpleCollector) and a BucketCollector to run all scoped aggregations.
 
 ===== Understanding MultiTermQuery output
 
-A special note needs to be made about the `MultiTermQuery` class of queries.  This includes wildcards, regex, and fuzzy
-queries.  These queries emit very verbose responses, and are not overly structured.
-
-Essentially, these queries rewrite themselves on a per-segment basis.  If you imagine the wildcard query `b*`, it technically
-can match any token that begins with the letter "b".  It would be impossible to enumerate all possible combinations,
-so Lucene rewrites the query in context of the segment being evaluated, e.g., one segment may contain the tokens
-`[bar, baz]`, so the query rewrites to a BooleanQuery combination of "bar" and "baz".  Another segment may only have the
-token `[bakery]`, so the query rewrites to a single TermQuery for "bakery".
-
-Due to this dynamic, per-segment rewriting, the clean tree structure becomes distorted and no longer follows a clean
-"lineage" showing how one query rewrites into the next.  At present time, all we can do is apologize, and suggest you
-collapse the details for that query's children if it is too confusing.  Luckily, all the timing statistics are correct,
-just not the physical layout in the response, so it is sufficient to just analyze the top-level MultiTermQuery and
+A special note needs to be made about the `MultiTermQuery` class of queries. 
+This includes wildcards, regex, and fuzzy queries. These queries emit very 
+verbose responses, and are not overly structured.
+
+Essentially, these queries rewrite themselves on a per-segment basis. If you 
+imagine the wildcard query `b*`, it technically can match any token that begins 
+with the letter "b".  It would be impossible to enumerate all possible 
+combinations, so Lucene rewrites the query in context of the segment being 
+evaluated, e.g., one segment may contain the tokens `[bar, baz]`, so the query 
+rewrites to a BooleanQuery combination of "bar" and "baz".  Another segment may 
+only have the token `[bakery]`, so the query rewrites to a single TermQuery for 
+"bakery".
+
+Due to this dynamic, per-segment rewriting, the clean tree structure becomes 
+distorted and no longer follows a clean "lineage" showing how one query rewrites 
+into the next.  At present time, all we can do is apologize, and suggest you 
+collapse the details for that query's children if it is too confusing. Luckily, 
+all the timing statistics are correct, just not the physical layout in the 
+response, so it is sufficient to just analyze the top-level MultiTermQuery and
 ignore its children if you find the details too tricky to interpret.
 
-Hopefully this will be fixed in future iterations, but it is a tricky problem to solve and still in-progress :)
+Hopefully this will be fixed in future iterations, but it is a tricky problem to 
+solve and still in progress. :)
 
 [[profiling-aggregations]]
-==== Profiling Aggregations
+===== Profiling Aggregations
+
 
 [[agg-section]]
-===== `aggregations` Section
+====== `aggregations` Section
 
 
-The `aggregations` section contains detailed timing of the aggregation tree executed by a particular shard.
-The overall structure of this aggregation tree will resemble your original Elasticsearch request.  Let's
-execute the previous query again and look at the aggregation profile this time:
+The `aggregations` section contains detailed timing of the aggregation tree 
+executed by a particular shard. The overall structure of this aggregation tree 
+will resemble your original {es} request. Let's execute the previous query again 
+and look at the aggregation profile this time:
 
 [source,console]
 --------------------------------------------------
@@ -741,6 +792,7 @@ GET /twitter/_search
 // TEST[s/_search/_search\?filter_path=profile.shards.aggregations/]
 // TEST[continued]
 
+
 This yields the following aggregation profile output:
 
 [source,console-result]
@@ -807,16 +859,20 @@ This yields the following aggregation profile output:
 // TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
 // TESTRESPONSE[s/"id": "\[P6-vulHtQRWuD4YnubWb7A\]\[test\]\[0\]"/"id": $body.profile.shards.0.id/]
 
-From the profile structure we can see that the `my_scoped_agg` is internally being run as a `LongTermsAggregator` (because the field it is
-aggregating, `likes`, is a numeric field).  At the same level, we see a `GlobalAggregator` which comes from `my_global_agg`.  That
-aggregation then has a child `LongTermsAggregator` which comes from the second term's aggregation on `likes`.
+From the profile structure we can see that the `my_scoped_agg` is internally 
+being run as a `LongTermsAggregator` (because the field it is aggregating, 
+`likes`, is a numeric field). At the same level, we see a `GlobalAggregator` 
+which comes from `my_global_agg`. That aggregation then has a child 
+`LongTermsAggregator` which comes from the second `terms` aggregation on `likes`.
 
-The `time_in_nanos` field shows the time executed by each aggregation, and is inclusive of all children.  While the overall time is useful,
-the `breakdown` field will give detailed stats about how the time was spent.
+The `time_in_nanos` field shows the time executed by each aggregation, and is 
+inclusive of all children.  While the overall time is useful, the `breakdown` 
+field will give detailed stats about how the time was spent.
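Like the query tree, the aggregation tree can be walked to list each aggregator with its time. A sketch over a hypothetical fragment shaped like the structure described above (the child aggregation name and all timings are made up for illustration):

```python
def agg_times(aggs, out=None):
    """Collect (description, type, time_in_nanos) from a profiled aggregation tree."""
    if out is None:
        out = []
    for agg in aggs:
        out.append((agg["description"], agg["type"], agg["time_in_nanos"]))
        agg_times(agg.get("children", []), out)
    return out

# Hypothetical fragment: a scoped terms agg plus a global agg with a child.
profile_aggs = [
    {"description": "my_scoped_agg", "type": "LongTermsAggregator",
     "time_in_nanos": 195386},
    {"description": "my_global_agg", "type": "GlobalAggregator",
     "time_in_nanos": 190430,
     "children": [
         {"description": "my_level_agg", "type": "LongTermsAggregator",
          "time_in_nanos": 160329},
     ]},
]
for name, agg_type, ns in agg_times(profile_aggs):
    print(f"{name} ({agg_type}): {ns}ns")
```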
 
-====== Timing Breakdown
+===== Timing Breakdown
 
-The `breakdown` component lists detailed timing statistics about low-level Lucene execution:
+The `breakdown` component lists detailed timing statistics about low-level 
+Lucene execution:
 
 [source,js]
 --------------------------------------------------
@@ -834,10 +890,11 @@ The `breakdown` component lists detailed timing statistics about low-level Lucen
 --------------------------------------------------
 // NOTCONSOLE
 
-Timings are listed in wall-clock nanoseconds and are not normalized at all.  All caveats about the overall
-`time` apply here.  The intention of the breakdown is to give you a feel for A) what machinery in Elasticsearch is
-actually eating time, and B) the magnitude of differences in times between the various components.  Like the overall time,
-the breakdown is inclusive of all children times.
+Timings are listed in wall-clock nanoseconds and are not normalized at all. All 
+caveats about the overall `time` apply here.  The intention of the breakdown is 
+to give you a feel for A) what machinery in {es} is actually eating time, and B) 
+the magnitude of differences in times between the various components. Like the 
+overall time, the breakdown is inclusive of all children times.
 
 The meanings of the stats are as follows:
 
@@ -866,28 +923,33 @@ The meaning of the stats are as follows:
     means the `collect()` method was called on two different documents.
 
 [[profiling-considerations]]
-==== Profiling Considerations
-
-===== Performance Notes
+===== Profiling Considerations
 
-Like any profiler, the Profile API introduces a non-negligible overhead to search execution.  The act of instrumenting
-low-level method calls such as `collect`, `advance`, and `next_doc` can be fairly expensive, since these methods are called
-in tight loops.  Therefore, profiling should not be enabled in production settings by default, and should not
-be compared against non-profiled query times.  Profiling is just a diagnostic tool.
+Like any profiler, the Profile API introduces a non-negligible overhead to 
+search execution. The act of instrumenting low-level method calls such as 
+`collect`, `advance`, and `next_doc` can be fairly expensive, since these 
+methods are called in tight loops.  Therefore, profiling should not be enabled 
+in production settings by default, and should not be compared against 
+non-profiled query times. Profiling is just a diagnostic tool.
 
-There are also cases where special Lucene optimizations are disabled, since they are not amenable to profiling.  This
-could cause some queries to report larger relative times than their non-profiled counterparts, but in general should
-not have a drastic effect compared to other components in the profiled query.
+There are also cases where special Lucene optimizations are disabled, since they 
+are not amenable to profiling. This could cause some queries to report larger 
+relative times than their non-profiled counterparts, but in general should not 
+have a drastic effect compared to other components in the profiled query.
 
 [[profile-limitations]]
 ===== Limitations
 
-- Profiling currently does not measure the search fetch phase nor the network overhead
-- Profiling also does not account for time spent in the queue, merging shard responses on the coordinating node, or
-additional work such as building global ordinals (an internal data structure used to speed up search)
-- Profiling statistics are currently not available for suggestions, highlighting, `dfs_query_then_fetch`
-- Profiling of the reduce phase of aggregation is currently not available
-- The Profiler is still highly experimental. The Profiler is instrumenting parts of Lucene that were
-never designed to be exposed in this manner, and so all results should be viewed as a best effort to provide detailed
-diagnostics.  We hope to improve this over time. If you find obviously wrong numbers, strange query structures, or
-other bugs, please report them!
+- Profiling currently does not measure the search fetch phase nor the network 
+overhead.
+- Profiling also does not account for time spent in the queue, merging shard 
+responses on the coordinating node, or additional work such as building global 
+ordinals (an internal data structure used to speed up search).
+- Profiling statistics are currently not available for suggestions, 
+highlighting, or `dfs_query_then_fetch`.
+- Profiling of the reduce phase of aggregation is currently not available.
+- The Profiler is still highly experimental. The Profiler is instrumenting parts 
+of Lucene that were never designed to be exposed in this manner, and so all 
+results should be viewed as a best effort to provide detailed diagnostics. We 
+hope to improve this over time. If you find obviously wrong numbers, strange 
+query structures, or other bugs, please report them!