4 years ago · 9b99234b4a
--- a/server/src/main/java/org/elasticsearch/search/aggregations/package-info.java
+++ b/server/src/main/java/org/elasticsearch/search/aggregations/package-info.java
@@ -7,6 +7,65 @@
 
				  */
			
 
				 
			
 
				 /**
			
 
				- * Builds analytic information over all hits in a search request.
			
 
				+ * <h2>Aggregations</h2>
			
 
				+ * <p>Builds analytic information over all hits in a search request.  Aggregations
			
 
				+ * are essentially a tool for summarizing data, and that summary is often used
			
 
				+ * to generate a visualization.</p>
			
 
				+ *
			
 
				+ * <h2>Types of aggregations</h2>
			
 
				+ * There are three main types of aggregations, each in its own sub package:
			
 
				+ * <ul>
			
 
				+ *     <li>Bucket aggregations - which group documents (e.g. a histogram)</li>
			
 
				+ *     <li>Metric aggregations - which compute a summary value from several
			
 
				+ *     documents (e.g. a sum)</li>
			
 
				+ *     <li>Pipeline aggregations - which run as a separate step and compute
			
 
				+ *     values across buckets</li>
			
 
				+ * </ul>
			
 
				+ * Additionally, there is a support sub package, which primarily contains the
			
 
				+ * type checking and resolution logic.
			
 
				+ *
			
 
				+ * <h2>How Aggregations Work</h2>
			
 
				+ * <p>TODO: Info about search phases goes here</p>
			
 
				+ *
			
 
				+ * <p>In general, aggregations operate as Map Reduce jobs.  The coordinating node for
			
 
				+ * the query dispatches the aggregation to each data node.  The data nodes all
			
 
				+ * instantiate an {@link org.elasticsearch.search.aggregations.AggregationBuilder}
			
 
				+ * of the appropriate type, which in turn builds the
			
 
				+ * {@link org.elasticsearch.search.aggregations.Aggregator} for that node.  This
			
 
				+ * collects the data from that shard, more or less via
			
 
				+ * {@link org.elasticsearch.search.aggregations.Aggregator#getLeafCollector(org.apache.lucene.index.LeafReaderContext)}.
			
 
				+ * These values are shipped back to the coordinating node, which
			
 
				+ * performs the reduction on them (partial reductions in place on the data nodes
			
 
				+ * are also possible).</p>
			
 
				+ *
			
 
				+ * <h3>Three modes of operation</h3>
			
 
				+ * <p>When it comes to actually collecting values, there are three ways aggregations
			
 
				+ * operate, in general.  Which one we choose depends on limitations in the query
			
 
				+ * and how the data was ingested (e.g. if it is searchable).</p>
			
 
				+ *
			
 
				+ * <p>The easiest to understand is the <strong>Compatible</strong> (i.e. usable in
			
 
				+ * all situations) mode, which can be thought of as iterating each query hit and
			
 
				+ * collecting a value from it.  This is the least performant way to evaluate
			
 
				+ * aggregations, because it requires looking at every hit.</p>
			
 
				+ *
			
 
				+ * <p>The fastest way to run an aggregation is by <strong>looking at the index structures
			
 
				+ * directly.</strong>  For example, Lucene just stores the minimum and maximum values
			
 
				+ * of fields per segment, so a min aggregation matching all documents in a segment
			
 
				+ * can just look up its result.  Generally speaking, this mode can be engaged when
			
 
				+ * there are no queries or sub-aggregations, and is gated by
			
 
				+ * {@link org.elasticsearch.search.aggregations.support.ValuesSourceConfig#getPointReaderOrNull()}.</p>
			
 
				+ *
			
 
				+ * <p>Finally, we can <strong>rewrite</strong> an aggregation into faster aggregations,
			
 
				+ * or ideally into just a query.  Generally, the goal here is to get to
			
 
				+ * <strong>filter by filters</strong> (which is an optimization on the filters aggregation
			
 
				+ * which runs it as a set of filter queries).  Often this process will look like rewriting
			
 
				+ * a DateHistogram into a DateRange, and then rewriting the DateRange into Filters.
			
 
				+ * If you see {@link org.elasticsearch.search.aggregations.AdaptingAggregator}, that's
			
 
				+ * a good clue that the rewrite mode is being used.  In general, when we rewrite aggregations,
			
 
				+ * we are able to detect if the rewritten agg can run in a "fast" mode, and decline the
			
 
				+ * rewrite if it can't.</p>
			
 
				+ *
			
 
				+ * <p>In general, aggs will try to use one of the fast modes, and if that's not possible,
			
 
				+ * fall back to running in compatible mode.</p>
			
 
				  */
			
 
				 package org.elasticsearch.search.aggregations;
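
The map/reduce flow and the first two collection modes described in the Javadoc above can be illustrated with a small, self-contained sketch. This is not Elasticsearch code: `MinAggSketch`, `Segment`, `shardMin`, and `reduce` are hypothetical names invented for illustration, and a predicate on the field value stands in for a real query. It assumes each segment keeps a precomputed minimum, mirroring the per-segment min/max that the text says Lucene stores.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;
import java.util.function.LongPredicate;

public class MinAggSketch {
    /** Hypothetical stand-in for an index segment: field values plus a minimum computed at "index time". */
    static final class Segment {
        final long[] values;
        final long storedMin;

        Segment(long... values) {
            this.values = values;
            long min = Long.MAX_VALUE;
            for (long v : values) {
                min = Math.min(min, v);
            }
            this.storedMin = min;
        }
    }

    /** Combines two partial minimums; an empty value means "no documents seen". */
    static OptionalLong merge(OptionalLong a, OptionalLong b) {
        if (!a.isPresent()) return b;
        if (!b.isPresent()) return a;
        return OptionalLong.of(Math.min(a.getAsLong(), b.getAsLong()));
    }

    /** "Map" step, run per shard. A null query means "match all documents". */
    static OptionalLong shardMin(List<Segment> segments, LongPredicate query) {
        OptionalLong result = OptionalLong.empty();
        for (Segment segment : segments) {
            if (query == null) {
                // Fast mode: every document matches, so read the stored minimum directly.
                if (segment.values.length > 0) {
                    result = merge(result, OptionalLong.of(segment.storedMin));
                }
            } else {
                // Compatible mode: visit every hit and collect a value from it.
                for (long v : segment.values) {
                    if (query.test(v)) {
                        result = merge(result, OptionalLong.of(v));
                    }
                }
            }
        }
        return result;
    }

    /** "Reduce" step, run on the coordinating node over the per-shard results. */
    static OptionalLong reduce(List<OptionalLong> shardResults) {
        OptionalLong result = OptionalLong.empty();
        for (OptionalLong partial : shardResults) {
            result = merge(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> shardA = Arrays.asList(new Segment(5, 3, 9), new Segment(7, 4));
        List<Segment> shardB = Arrays.asList(new Segment(8, 2));

        // No query: both shards take the fast path.
        OptionalLong all = reduce(Arrays.asList(shardMin(shardA, null), shardMin(shardB, null)));
        System.out.println(all.getAsLong()); // prints 2

        // With a query: both shards fall back to visiting every hit.
        LongPredicate query = v -> v > 3;
        OptionalLong filtered = reduce(Arrays.asList(shardMin(shardA, query), shardMin(shardB, query)));
        System.out.println(filtered.getAsLong()); // prints 4
    }
}
```

Both paths produce the same kind of partial result, which is why the reduce step does not need to know which mode a shard used.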