|
@@ -1,20 +1,30 @@
|
|
|
[[search-profile]]
|
|
|
=== Profile API
|
|
|
|
|
|
-WARNING: The Profile API is a debugging tool and adds significant overhead to search execution.
|
|
|
+WARNING: The Profile API is a debugging tool and adds significant overhead to search execution.
|
|
|
|
|
|
-The Profile API provides detailed timing information about the execution of individual components
|
|
|
-in a search request. It gives the user insight into how search requests are executed at a low level so that
|
|
|
-the user can understand why certain requests are slow, and take steps to improve them.
|
|
|
-Note that the Profile API, <<profile-limitations, amongst other things>>, doesn't measure
|
|
|
-network latency, time spent in the search fetch phase, time spent while the requests spends
|
|
|
-in queues or while merging shard responses on the coordinating node.
|
|
|
+Provides detailed timing information about the execution of individual
|
|
|
+components in a search request.
|
|
|
|
|
|
-The output from the Profile API is *very* verbose, especially for complicated requests executed across
|
|
|
-many shards. Pretty-printing the response is recommended to help understand the output
|
|
|
|
|
|
-[float]
|
|
|
-==== Usage
|
|
|
+[[search-profile-api-desc]]
|
|
|
+==== {api-description-title}
|
|
|
+
|
|
|
+The Profile API gives the user insight into how search requests are executed at
|
|
|
+a low level so that the user can understand why certain requests are slow, and
|
|
|
+take steps to improve them. Note that the Profile API,
|
|
|
+<<profile-limitations, amongst other things>>, doesn't measure network latency,
|
|
|
+time spent in the search fetch phase, time the request spends in
|
|
|
+queues, or time spent merging shard responses on the coordinating node.
|
|
|
+
|
|
|
+The output from the Profile API is *very* verbose, especially for complicated
|
|
|
+requests executed across many shards. Pretty-printing the response is
|
|
|
+recommended to help understand the output.
|
|
|
+
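+For example, appending the standard `?pretty` query string parameter to a
+profiled search (a minimal sketch; the request body mirrors the example in the
+next section) returns an indented response that is much easier to scan:
+
+[source,console]
+--------------------------------------------------
+GET /twitter/_search?pretty
+{
+  "profile": true,
+  "query": {
+    "match": { "message": "search test" }
+  }
+}
+--------------------------------------------------
+// TEST[setup:twitter]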
|
|
|
+
|
|
|
+[[search-profile-api-example]]
|
|
|
+==== {api-examples-title}
|
|
|
+
|
|
|
|
|
|
Any `_search` request can be profiled by adding a top-level `profile` parameter:
|
|
|
|
|
@@ -31,9 +41,10 @@ GET /twitter/_search
|
|
|
// TEST[setup:twitter]
|
|
|
|
|
|
<1> Setting the top-level `profile` parameter to `true` will enable profiling
|
|
|
-for the search
|
|
|
+for the search.
|
|
|
+
|
|
|
|
|
|
-This will yield the following result:
|
|
|
+The API returns the following result:
|
|
|
|
|
|
[source,console-result]
|
|
|
--------------------------------------------------
|
|
@@ -167,12 +178,13 @@ This will yield the following result:
|
|
|
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
|
|
|
// TESTRESPONSE[s/\[2aE02wS1R8q_QFnYu6vDVQ\]\[twitter\]\[0\]/$body.$_path/]
|
|
|
|
|
|
-<1> Search results are returned, but were omitted here for brevity
|
|
|
+<1> Search results are returned, but were omitted here for brevity.
|
|
|
|
|
|
-Even for a simple query, the response is relatively complicated. Let's break it down piece-by-piece before moving
|
|
|
-to more complex examples.
|
|
|
+Even for a simple query, the response is relatively complicated. Let's break it
|
|
|
+down piece-by-piece before moving to more complex examples.
|
|
|
|
|
|
-First, the overall structure of the profile response is as follows:
|
|
|
+
|
|
|
+The overall structure of the profile response is as follows:
|
|
|
|
|
|
[source,console-result]
|
|
|
--------------------------------------------------
|
|
@@ -200,27 +212,37 @@ First, the overall structure of the profile response is as follows:
|
|
|
// TESTRESPONSE[s/"query": \[...\]/"query": $body.$_path/]
|
|
|
// TESTRESPONSE[s/"collector": \[...\]/"collector": $body.$_path/]
|
|
|
// TESTRESPONSE[s/"aggregations": \[...\]/"aggregations": []/]
|
|
|
-<1> A profile is returned for each shard that participated in the response, and is identified
|
|
|
-by a unique ID
|
|
|
-<2> Each profile contains a section which holds details about the query execution
|
|
|
-<3> Each profile has a single time representing the cumulative rewrite time
|
|
|
-<4> Each profile also contains a section about the Lucene Collectors which run the search
|
|
|
-<5> Each profile contains a section which holds the details about the aggregation execution
|
|
|
-
|
|
|
-Because a search request may be executed against one or more shards in an index, and a search may cover
|
|
|
-one or more indices, the top level element in the profile response is an array of `shard` objects.
|
|
|
-Each shard object lists its `id` which uniquely identifies the shard. The ID's format is
|
|
|
+<1> A profile is returned for each shard that participated in the response, and
|
|
|
+is identified by a unique ID.
|
|
|
+<2> Each profile contains a section which holds details about the query
|
|
|
+execution.
|
|
|
+<3> Each profile has a single time representing the cumulative rewrite time.
|
|
|
+<4> Each profile also contains a section about the Lucene Collectors which run
|
|
|
+the search.
|
|
|
+<5> Each profile contains a section which holds the details about the
|
|
|
+aggregation execution.
|
|
|
+
|
|
|
+Because a search request may be executed against one or more shards in an index,
|
|
|
+and a search may cover one or more indices, the top level element in the profile
|
|
|
+response is an array of `shard` objects. Each shard object lists its `id`, which
|
|
|
+uniquely identifies the shard. The ID's format is
|
|
|
`[nodeID][indexName][shardID]`.
|
|
|
|
|
|
-The profile itself may consist of one or more "searches", where a search is a query executed against the underlying
|
|
|
-Lucene index. Most search requests submitted by the user will only execute a single `search` against the Lucene index.
|
|
|
-But occasionally multiple searches will be executed, such as including a global aggregation (which needs to execute
|
|
|
-a secondary "match_all" query for the global context).
|
|
|
+The profile itself may consist of one or more "searches", where a search is a
|
|
|
+query executed against the underlying Lucene index. Most search requests
|
|
|
+submitted by the user will only execute a single `search` against the Lucene
|
|
|
+index. But occasionally multiple searches will be executed, such as including a
|
|
|
+global aggregation (which needs to execute a secondary "match_all" query for the
|
|
|
+global context).
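+
+For illustration, a request such as the following (a minimal sketch; the
+`all_tweets` and `likes_outside_query` aggregation names are hypothetical, and
+the `user` and `likes` fields come from the more complex example later on this
+page) includes a `global` aggregation and therefore triggers the secondary
+"match_all" search described above:
+
+[source,console]
+--------------------------------------------------
+GET /twitter/_search
+{
+  "profile": true,
+  "query": { "term": { "user": "test" } },
+  "aggs": {
+    "all_tweets": {
+      "global": {},
+      "aggs": {
+        "likes_outside_query": { "terms": { "field": "likes" } }
+      }
+    }
+  }
+}
+--------------------------------------------------
+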
|
|
|
|
|
|
Inside each `search` object there will be two arrays of profiled information:
|
|
|
-a `query` array and a `collector` array. Alongside the `search` object is an `aggregations` object that contains the profile information for the aggregations. In the future, more sections may be added, such as `suggest`, `highlight`, etc.
|
|
|
+a `query` array and a `collector` array. Alongside the `search` object is an
|
|
|
+`aggregations` object that contains the profile information for the
|
|
|
+aggregations. In the future, more sections may be added, such as `suggest`,
|
|
|
+`highlight`, etc.
|
|
|
|
|
|
-There will also be a `rewrite` metric showing the total time spent rewriting the query (in nanoseconds).
|
|
|
+There will also be a `rewrite` metric showing the total time spent rewriting the
|
|
|
+query (in nanoseconds).
|
|
|
|
|
|
NOTE: As with other statistics apis, the Profile API supports human readable outputs. This can be turned on by adding
|
|
|
`?human=true` to the query string. In this case, the output contains the additional `time` field containing rounded,
|
|
@@ -245,10 +267,11 @@ the `advance` phase of that query is the cause, for example.
|
|
|
[[query-section]]
|
|
|
===== `query` Section
|
|
|
|
|
|
-The `query` section contains detailed timing of the query tree executed by Lucene on a particular shard.
|
|
|
-The overall structure of this query tree will resemble your original Elasticsearch query, but may be slightly
|
|
|
-(or sometimes very) different. It will also use similar but not always identical naming. Using our previous
|
|
|
-`match` query example, let's analyze the `query` section:
|
|
|
+The `query` section contains detailed timing of the query tree executed by
|
|
|
+Lucene on a particular shard. The overall structure of this query tree will
|
|
|
+resemble your original {es} query, but may be slightly (or sometimes
|
|
|
+very) different. It will also use similar but not always identical naming.
|
|
|
+Using our previous `match` query example, let's analyze the `query` section:
|
|
|
|
|
|
[source,console-result]
|
|
|
--------------------------------------------------
|
|
@@ -279,25 +302,30 @@ The overall structure of this query tree will resemble your original Elasticsear
|
|
|
// TESTRESPONSE[s/]$/],"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": []}]}}/]
|
|
|
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
|
|
|
// TESTRESPONSE[s/"breakdown": \{...\}/"breakdown": $body.$_path/]
|
|
|
-<1> The breakdown timings are omitted for simplicity
|
|
|
+<1> The breakdown timings are omitted for simplicity.
|
|
|
|
|
|
-Based on the profile structure, we can see that our `match` query was rewritten by Lucene into a BooleanQuery with two
|
|
|
-clauses (both holding a TermQuery). The `type` field displays the Lucene class name, and often aligns with
|
|
|
-the equivalent name in Elasticsearch. The `description` field displays the Lucene explanation text for the query, and
|
|
|
-is made available to help differentiating between parts of your query (e.g. both `message:search` and `message:test`
|
|
|
-are TermQuery's and would appear identical otherwise.
|
|
|
+Based on the profile structure, we can see that our `match` query was rewritten
|
|
|
+by Lucene into a BooleanQuery with two clauses (both holding a TermQuery). The
|
|
|
+`type` field displays the Lucene class name, and often aligns with the
|
|
|
+equivalent name in {es}. The `description` field displays the Lucene
|
|
|
+explanation text for the query, and is made available to help differentiate
|
|
|
+between parts of your query (e.g. both `message:search` and `message:test` are
|
|
|
+TermQuery instances and would appear identical otherwise).
|
|
|
|
|
|
-The `time_in_nanos` field shows that this query took ~1.8ms for the entire BooleanQuery to execute. The recorded time is inclusive
|
|
|
-of all children.
|
|
|
+The `time_in_nanos` field shows that this query took ~1.8ms for the entire
|
|
|
+BooleanQuery to execute. The recorded time is inclusive of all children.
|
|
|
|
|
|
-The `breakdown` field will give detailed stats about how the time was spent, we'll look at
|
|
|
-that in a moment. Finally, the `children` array lists any sub-queries that may be present. Because we searched for two
|
|
|
-values ("search test"), our BooleanQuery holds two children TermQueries. They have identical information (type, time,
|
|
|
-breakdown, etc). Children are allowed to have their own children.
|
|
|
+The `breakdown` field will give detailed stats about how the time was spent;
|
|
|
+we'll look at that in a moment. Finally, the `children` array lists any
|
|
|
+sub-queries that may be present. Because we searched for two values ("search
|
|
|
+test"), our BooleanQuery holds two children TermQueries. They have identical
|
|
|
+information (type, time, breakdown, etc.). Children are allowed to have their
|
|
|
+own children.
|
|
|
|
|
|
-====== Timing Breakdown
|
|
|
+===== Timing Breakdown
|
|
|
|
|
|
-The `breakdown` component lists detailed timing statistics about low-level Lucene execution:
|
|
|
+The `breakdown` component lists detailed timing statistics about low-level
|
|
|
+Lucene execution:
|
|
|
|
|
|
[source,console-result]
|
|
|
--------------------------------------------------
|
|
@@ -326,10 +354,12 @@ The `breakdown` component lists detailed timing statistics about low-level Lucen
|
|
|
// TESTRESPONSE[s/}$/},\n"children": $body.$_path}],\n"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": []}]}}/]
|
|
|
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
|
|
|
|
|
|
-Timings are listed in wall-clock nanoseconds and are not normalized at all. All caveats about the overall
|
|
|
-`time_in_nanos` apply here. The intention of the breakdown is to give you a feel for A) what machinery in Lucene is
|
|
|
-actually eating time, and B) the magnitude of differences in times between the various components. Like the overall time,
|
|
|
-the breakdown is inclusive of all children times.
|
|
|
+Timings are listed in wall-clock nanoseconds and are not normalized at all. All
|
|
|
+caveats about the overall `time_in_nanos` apply here. The intention of the
|
|
|
+breakdown is to give you a feel for A) what machinery in Lucene is actually
|
|
|
+eating time, and B) the magnitude of differences in times between the various
|
|
|
+components. Like the overall time, the breakdown is inclusive of all children
|
|
|
+times.
|
|
|
|
|
|
The meaning of the stats are as follows:
|
|
|
|
|
@@ -399,13 +429,15 @@ The meaning of the stats are as follows:
|
|
|
means the `nextDoc()` method was called on two different documents. This can be used to help judge
|
|
|
how selective queries are, by comparing counts between different query components.
|
|
|
|
|
|
+
|
|
|
[[collectors-section]]
|
|
|
===== `collectors` Section
|
|
|
|
|
|
-The Collectors portion of the response shows high-level execution details. Lucene works by defining a "Collector"
|
|
|
-which is responsible for coordinating the traversal, scoring, and collection of matching documents. Collectors
|
|
|
-are also how a single query can record aggregation results, execute unscoped "global" queries, execute post-query
|
|
|
-filters, etc.
|
|
|
+The Collectors portion of the response shows high-level execution details.
|
|
|
+Lucene works by defining a "Collector" which is responsible for coordinating the
|
|
|
+traversal, scoring, and collection of matching documents. Collectors are also
|
|
|
+how a single query can record aggregation results, execute unscoped "global"
|
|
|
+queries, execute post-query filters, etc.
|
|
|
|
|
|
Looking at the previous example:
|
|
|
|
|
@@ -430,15 +462,20 @@ Looking at the previous example:
|
|
|
// TESTRESPONSE[s/]$/]}], "aggregations": []}]}}/]
|
|
|
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
|
|
|
|
|
|
-We see a single collector named `SimpleTopScoreDocCollector` wrapped into `CancellableCollector`. `SimpleTopScoreDocCollector` is the default "scoring and sorting"
|
|
|
-`Collector` used by Elasticsearch. The `reason` field attempts to give a plain English description of the class name. The
|
|
|
-`time_in_nanos` is similar to the time in the Query tree: a wall-clock time inclusive of all children. Similarly, `children` lists
|
|
|
-all sub-collectors. The `CancellableCollector` that wraps `SimpleTopScoreDocCollector` is used by Elasticsearch to detect if the current
|
|
|
-search was cancelled and stop collecting documents as soon as it occurs.
|
|
|
|
|
|
-It should be noted that Collector times are **independent** from the Query times. They are calculated, combined,
|
|
|
-and normalized independently! Due to the nature of Lucene's execution, it is impossible to "merge" the times
|
|
|
-from the Collectors into the Query section, so they are displayed in separate portions.
|
|
|
+We see a single collector named `SimpleTopScoreDocCollector` wrapped into
|
|
|
+`CancellableCollector`. `SimpleTopScoreDocCollector` is the default "scoring and
|
|
|
+sorting" `Collector` used by {es}. The `reason` field attempts to give a plain
|
|
|
+English description of the class name. The `time_in_nanos` is similar to the
|
|
|
+time in the Query tree: a wall-clock time inclusive of all children. Similarly,
|
|
|
+`children` lists all sub-collectors. The `CancellableCollector` that wraps
|
|
|
+`SimpleTopScoreDocCollector` is used by {es} to detect if the current search was
|
|
|
+cancelled and stop collecting documents as soon as it occurs.
|
|
|
+
|
|
|
+It should be noted that Collector times are **independent** of the Query
|
|
|
+times. They are calculated, combined, and normalized independently! Due to the
|
|
|
+nature of Lucene's execution, it is impossible to "merge" the times from the
|
|
|
+Collectors into the Query section, so they are displayed in separate portions.
|
|
|
|
|
|
For reference, the various collector reasons are:
|
|
|
|
|
@@ -489,20 +526,22 @@ For reference, the various collector reasons are:
|
|
|
[[rewrite-section]]
|
|
|
===== `rewrite` Section
|
|
|
|
|
|
-All queries in Lucene undergo a "rewriting" process. A query (and its sub-queries) may be rewritten one or
|
|
|
-more times, and the process continues until the query stops changing. This process allows Lucene to perform
|
|
|
-optimizations, such as removing redundant clauses, replacing one query for a more efficient execution path,
|
|
|
-etc. For example a Boolean -> Boolean -> TermQuery can be rewritten to a TermQuery, because all the Booleans
|
|
|
-are unnecessary in this case.
|
|
|
+All queries in Lucene undergo a "rewriting" process. A query (and its
|
|
|
+sub-queries) may be rewritten one or more times, and the process continues until
|
|
|
+the query stops changing. This process allows Lucene to perform optimizations,
|
|
|
+such as removing redundant clauses, replacing one query with a more efficient
|
|
|
+execution path, etc. For example, a Boolean -> Boolean -> TermQuery can be
|
|
|
+rewritten to a TermQuery, because all the Booleans are unnecessary in this case.
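+
+As a concrete illustration (a minimal sketch against the `twitter` example
+index; the exact rewritten form is decided by Lucene), profiling a query with
+redundant `bool` nesting such as the one below should show the effect: the
+`query` tree in the profile reflects the rewritten query rather than the nested
+structure that was submitted.
+
+[source,console]
+--------------------------------------------------
+GET /twitter/_search
+{
+  "profile": true,
+  "query": {
+    "bool": {
+      "must": [
+        {
+          "bool": {
+            "must": [
+              { "term": { "user": "test" } }
+            ]
+          }
+        }
+      ]
+    }
+  }
+}
+--------------------------------------------------
+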
|
|
|
|
|
|
-The rewriting process is complex and difficult to display, since queries can change drastically. Rather than
|
|
|
-showing the intermediate results, the total rewrite time is simply displayed as a value (in nanoseconds). This
|
|
|
-value is cumulative and contains the total time for all queries being rewritten.
|
|
|
+The rewriting process is complex and difficult to display, since queries can
|
|
|
+change drastically. Rather than showing the intermediate results, the total
|
|
|
+rewrite time is simply displayed as a value (in nanoseconds). This value is
|
|
|
+cumulative and contains the total time for all queries being rewritten.
|
|
|
|
|
|
===== A more complex example
|
|
|
|
|
|
-
|
|
|
-To demonstrate a slightly more complex query and the associated results, we can profile the following query:
|
|
|
+To demonstrate a slightly more complex query and the associated results, we can
|
|
|
+profile the following query:
|
|
|
|
|
|
[source,console]
|
|
|
--------------------------------------------------
|
|
@@ -550,7 +589,8 @@ This example has:
|
|
|
- A global aggregation
|
|
|
- A post_filter
|
|
|
|
|
|
-And the response:
|
|
|
+
|
|
|
+The API returns the following result:
|
|
|
|
|
|
[source,console-result]
|
|
|
--------------------------------------------------
|
|
@@ -660,47 +700,58 @@ And the response:
|
|
|
// TESTRESPONSE[s/\.\.\.//]
|
|
|
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
|
|
|
// TESTRESPONSE[s/"id": "\[P6-vulHtQRWuD4YnubWb7A\]\[test\]\[0\]"/"id": $body.profile.shards.0.id/]
|
|
|
-<1> The `"aggregations"` portion has been omitted because it will be covered in the next section
|
|
|
+<1> The `"aggregations"` portion has been omitted because it will be covered in
|
|
|
+the next section.
|
|
|
|
|
|
-As you can see, the output is significantly more verbose than before. All the major portions of the query are
|
|
|
-represented:
|
|
|
+As you can see, the output is significantly more verbose than before. All the
|
|
|
+major portions of the query are represented:
|
|
|
|
|
|
-1. The first `TermQuery` (user:test) represents the main `term` query
|
|
|
-2. The second `TermQuery` (message:some) represents the `post_filter` query
|
|
|
+1. The first `TermQuery` (user:test) represents the main `term` query.
|
|
|
+2. The second `TermQuery` (message:some) represents the `post_filter` query.
|
|
|
|
|
|
-The Collector tree is fairly straightforward, showing how a single CancellableCollector wraps a MultiCollector
|
|
|
- which also wraps a FilteredCollector to execute the post_filter (and in turn wraps the normal scoring SimpleCollector),
|
|
|
- a BucketCollector to run all scoped aggregations.
|
|
|
+The Collector tree is fairly straightforward, showing how a single
|
|
|
+CancellableCollector wraps a MultiCollector, which in turn wraps both a
|
|
|
+FilteredCollector to execute the post_filter (which itself wraps the normal
|
|
|
+scoring SimpleCollector) and a BucketCollector to run all scoped aggregations.
|
|
|
|
|
|
===== Understanding MultiTermQuery output
|
|
|
|
|
|
-A special note needs to be made about the `MultiTermQuery` class of queries. This includes wildcards, regex, and fuzzy
|
|
|
-queries. These queries emit very verbose responses, and are not overly structured.
|
|
|
-
|
|
|
-Essentially, these queries rewrite themselves on a per-segment basis. If you imagine the wildcard query `b*`, it technically
|
|
|
-can match any token that begins with the letter "b". It would be impossible to enumerate all possible combinations,
|
|
|
-so Lucene rewrites the query in context of the segment being evaluated, e.g., one segment may contain the tokens
|
|
|
-`[bar, baz]`, so the query rewrites to a BooleanQuery combination of "bar" and "baz". Another segment may only have the
|
|
|
-token `[bakery]`, so the query rewrites to a single TermQuery for "bakery".
|
|
|
-
|
|
|
-Due to this dynamic, per-segment rewriting, the clean tree structure becomes distorted and no longer follows a clean
|
|
|
-"lineage" showing how one query rewrites into the next. At present time, all we can do is apologize, and suggest you
|
|
|
-collapse the details for that query's children if it is too confusing. Luckily, all the timing statistics are correct,
|
|
|
-just not the physical layout in the response, so it is sufficient to just analyze the top-level MultiTermQuery and
|
|
|
+A special note needs to be made about the `MultiTermQuery` class of queries.
|
|
|
+This includes wildcards, regex, and fuzzy queries. These queries emit very
|
|
|
+verbose responses, and are not overly structured.
|
|
|
+
|
|
|
+Essentially, these queries rewrite themselves on a per-segment basis. If you
|
|
|
+imagine the wildcard query `b*`, it technically can match any token that begins
|
|
|
+with the letter "b". It would be impossible to enumerate all possible
|
|
|
+combinations, so Lucene rewrites the query in context of the segment being
|
|
|
+evaluated, e.g., one segment may contain the tokens `[bar, baz]`, so the query
|
|
|
+rewrites to a BooleanQuery combination of "bar" and "baz". Another segment may
|
|
|
+only have the token `[bakery]`, so the query rewrites to a single TermQuery for
|
|
|
+"bakery".
|
|
|
+
|
|
|
+Due to this dynamic, per-segment rewriting, the clean tree structure becomes
|
|
|
+distorted and no longer follows a clean "lineage" showing how one query rewrites
|
|
|
+into the next. At present, all we can do is apologize and suggest you
|
|
|
+collapse the details for that query's children if it is too confusing. Luckily,
|
|
|
+all the timing statistics are correct, just not the physical layout in the
|
|
|
+response, so it is sufficient to just analyze the top-level MultiTermQuery and
|
|
|
ignore its children if you find the details too tricky to interpret.
|
|
|
|
|
|
-Hopefully this will be fixed in future iterations, but it is a tricky problem to solve and still in-progress :)
|
|
|
+Hopefully this will be fixed in future iterations, but it is a tricky problem to
|
|
|
+solve and the work is still in progress. :)
|
|
|
|
|
|
[[profiling-aggregations]]
|
|
|
-==== Profiling Aggregations
|
|
|
+===== Profiling Aggregations
|
|
|
+
|
|
|
|
|
|
[[agg-section]]
|
|
|
-===== `aggregations` Section
|
|
|
+====== `aggregations` Section
|
|
|
|
|
|
|
|
|
-The `aggregations` section contains detailed timing of the aggregation tree executed by a particular shard.
|
|
|
-The overall structure of this aggregation tree will resemble your original Elasticsearch request. Let's
|
|
|
-execute the previous query again and look at the aggregation profile this time:
|
|
|
+The `aggregations` section contains detailed timing of the aggregation tree
|
|
|
+executed by a particular shard. The overall structure of this aggregation tree
|
|
|
+will resemble your original {es} request. Let's execute the previous query again
|
|
|
+and look at the aggregation profile this time:
|
|
|
|
|
|
[source,console]
|
|
|
--------------------------------------------------
|
|
@@ -741,6 +792,7 @@ GET /twitter/_search
|
|
|
// TEST[s/_search/_search\?filter_path=profile.shards.aggregations/]
|
|
|
// TEST[continued]
|
|
|
|
|
|
+
|
|
|
This yields the following aggregation profile output:
|
|
|
|
|
|
[source,console-result]
|
|
@@ -807,16 +859,20 @@ This yields the following aggregation profile output:
|
|
|
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
|
|
|
// TESTRESPONSE[s/"id": "\[P6-vulHtQRWuD4YnubWb7A\]\[test\]\[0\]"/"id": $body.profile.shards.0.id/]
|
|
|
|
|
|
-From the profile structure we can see that the `my_scoped_agg` is internally being run as a `LongTermsAggregator` (because the field it is
|
|
|
-aggregating, `likes`, is a numeric field). At the same level, we see a `GlobalAggregator` which comes from `my_global_agg`. That
|
|
|
-aggregation then has a child `LongTermsAggregator` which comes from the second term's aggregation on `likes`.
|
|
|
+From the profile structure we can see that the `my_scoped_agg` is internally
|
|
|
+being run as a `LongTermsAggregator` (because the field it is aggregating,
|
|
|
+`likes`, is a numeric field). At the same level, we see a `GlobalAggregator`
|
|
|
+which comes from `my_global_agg`. That aggregation then has a child
|
|
|
+`LongTermsAggregator` which comes from the second `terms` aggregation on `likes`.
|
|
|
|
|
|
-The `time_in_nanos` field shows the time executed by each aggregation, and is inclusive of all children. While the overall time is useful,
|
|
|
-the `breakdown` field will give detailed stats about how the time was spent.
|
|
|
+The `time_in_nanos` field shows the time executed by each aggregation, and is
|
|
|
+inclusive of all children. While the overall time is useful, the `breakdown`
|
|
|
+field will give detailed stats about how the time was spent.
|
|
|
|
|
|
-====== Timing Breakdown
|
|
|
+====== Timing Breakdown
|
|
|
|
|
|
-The `breakdown` component lists detailed timing statistics about low-level Lucene execution:
|
|
|
+The `breakdown` component lists detailed timing statistics about low-level
|
|
|
+Lucene execution:
|
|
|
|
|
|
[source,js]
|
|
|
--------------------------------------------------
|
|
@@ -834,10 +890,11 @@ The `breakdown` component lists detailed timing statistics about low-level Lucen
|
|
|
--------------------------------------------------
|
|
|
// NOTCONSOLE
|
|
|
|
|
|
-Timings are listed in wall-clock nanoseconds and are not normalized at all. All caveats about the overall
|
|
|
-`time` apply here. The intention of the breakdown is to give you a feel for A) what machinery in Elasticsearch is
|
|
|
-actually eating time, and B) the magnitude of differences in times between the various components. Like the overall time,
|
|
|
-the breakdown is inclusive of all children times.
|
|
|
+Timings are listed in wall-clock nanoseconds and are not normalized at all. All
|
|
|
+caveats about the overall `time` apply here. The intention of the breakdown is
|
|
|
+to give you a feel for A) what machinery in {es} is actually eating time, and B)
|
|
|
+the magnitude of differences in times between the various components. Like the
|
|
|
+overall time, the breakdown is inclusive of all children times.
|
|
|
|
|
|
The meaning of the stats are as follows:
|
|
|
|
|
@@ -866,28 +923,33 @@ The meaning of the stats are as follows:
|
|
|
means the `collect()` method was called on two different documents.
|
|
|
|
|
|
[[profiling-considerations]]
|
|
|
-==== Profiling Considerations
|
|
|
-
|
|
|
-===== Performance Notes
|
|
|
+===== Profiling Considerations
|
|
|
|
|
|
-Like any profiler, the Profile API introduces a non-negligible overhead to search execution. The act of instrumenting
|
|
|
-low-level method calls such as `collect`, `advance`, and `next_doc` can be fairly expensive, since these methods are called
|
|
|
-in tight loops. Therefore, profiling should not be enabled in production settings by default, and should not
|
|
|
-be compared against non-profiled query times. Profiling is just a diagnostic tool.
|
|
|
+Like any profiler, the Profile API introduces a non-negligible overhead to
|
|
|
+search execution. The act of instrumenting low-level method calls such as
|
|
|
+`collect`, `advance`, and `next_doc` can be fairly expensive, since these
|
|
|
+methods are called in tight loops. Therefore, profiling should not be enabled
|
|
|
+in production settings by default, and should not be compared against
|
|
|
+non-profiled query times. Profiling is just a diagnostic tool.
|
|
|
|
|
|
-There are also cases where special Lucene optimizations are disabled, since they are not amenable to profiling. This
|
|
|
-could cause some queries to report larger relative times than their non-profiled counterparts, but in general should
|
|
|
-not have a drastic effect compared to other components in the profiled query.
|
|
|
+There are also cases where special Lucene optimizations are disabled, since they
|
|
|
+are not amenable to profiling. This could cause some queries to report larger
|
|
|
+relative times than their non-profiled counterparts, but in general should not
|
|
|
+have a drastic effect compared to other components in the profiled query.
|
|
|
|
|
|
[[profile-limitations]]
|
|
|
===== Limitations
|
|
|
|
|
|
-- Profiling currently does not measure the search fetch phase nor the network overhead
|
|
|
-- Profiling also does not account for time spent in the queue, merging shard responses on the coordinating node, or
|
|
|
-additional work such as building global ordinals (an internal data structure used to speed up search)
|
|
|
-- Profiling statistics are currently not available for suggestions, highlighting, `dfs_query_then_fetch`
|
|
|
-- Profiling of the reduce phase of aggregation is currently not available
|
|
|
-- The Profiler is still highly experimental. The Profiler is instrumenting parts of Lucene that were
|
|
|
-never designed to be exposed in this manner, and so all results should be viewed as a best effort to provide detailed
|
|
|
-diagnostics. We hope to improve this over time. If you find obviously wrong numbers, strange query structures, or
|
|
|
-other bugs, please report them!
|
|
|
+- Profiling currently does not measure the search fetch phase nor the network
|
|
|
+overhead.
|
|
|
+- Profiling also does not account for time spent in the queue, merging shard
|
|
|
+responses on the coordinating node, or additional work such as building global
|
|
|
+ordinals (an internal data structure used to speed up search).
|
|
|
+- Profiling statistics are currently not available for suggestions,
|
|
|
+highlighting, or `dfs_query_then_fetch`.
|
|
|
+- Profiling of the reduce phase of aggregation is currently not available.
|
|
|
+- The Profiler is still highly experimental. It instruments parts
|
|
|
+of Lucene that were never designed to be exposed in this manner, and so all
|
|
|
+results should be viewed as a best effort to provide detailed diagnostics. We
|
|
|
+hope to improve this over time. If you find obviously wrong numbers, strange
|
|
|
+query structures, or other bugs, please report them!
|