5 years ago · 6dad2662ae
--- a/docs/reference/transform/overview.asciidoc
+++ b/docs/reference/transform/overview.asciidoc
@@ -7,22 +7,19 @@
 
				 
			
 
				 beta[]
			
 
				 
			
 
				-A _{dataframe}_ is a two-dimensional tabular data structure. In the context of
			
 
				-the {stack}, it is a transformation of data that is indexed in {es}. For
			
 
				-example, you can use {dataframes} to _pivot_ your data into a new entity-centric
			
 
				-index. By transforming and summarizing your data, it becomes possible to
			
 
				-visualize and analyze it in alternative and interesting ways.
			
 
				+You can use {transforms} to _pivot_ your data into a new entity-centric index. 
			
 
				+By transforming and summarizing your data, you can visualize and 
			
 
				+analyze it in alternative and interesting ways.
			
 
				 
			
 
				 A lot of {es} indices are organized as a stream of events: each event is an 
			
 
				-individual document, for example a single item purchase. {dataframes-cap} enable
			
 
				+individual document, for example a single item purchase. {transforms-cap} enable
			
 
				 you to summarize this data, bringing it into an organized, more
			
 
				 analysis-friendly format. For example, you can summarize all the purchases of a
			
 
				 single customer.
			
 
				 
			
 
				-You can create {dataframes} by using {transforms}.
			
 
				 {transforms-cap} enable you to define a pivot, which is a set of
			
 
				 features that transform the index into a different, more digestible format.
			
 
				-Pivoting results in a summary of your data, which is the {dataframe}.
			
 
				+Pivoting results in a summary of your data in a new index.
			
 
				 
			
 
				 To define a pivot, first you select one or more fields that you will use to
			
 
				 group your data. You can select categorical fields (terms) and numerical fields
			
@@ -38,34 +35,32 @@ more about the supported aggregations and group-by fields, see
 
				 As an optional step, you can also add a query to further limit the scope of the
			
 
				 aggregation.
			
 
				 
			
 
				-The {transform} performs a composite aggregation that 
			
 
				-paginates through all the data defined by the source index query. The output of
			
 
				-the aggregation is stored in a destination index. Each time the 
			
 
				-{transform} queries the source index, it creates a _checkpoint_. You 
			
 
				-can decide whether you want the {transform} to run once (batch 
			
 
				-{transform}) or continuously ({transform}). A batch 
			
 
				-{transform} is a single operation that has a single checkpoint. 
			
 
				-{ctransforms-cap} continually increment and process checkpoints as new 
			
 
				-source data is ingested.
			
 
				+The {transform} performs a composite aggregation that paginates through all the 
			
 
				+data defined by the source index query. The output of the aggregation is stored 
			
 
				+in a destination index. Each time the {transform} queries the source index, it 
			
 
				+creates a _checkpoint_. You can decide whether you want the {transform} to run 
			
 
				+once (batch {transform}) or continuously ({ctransform}). A batch {transform} is a 
			
 
				+single operation that has a single checkpoint. {ctransforms-cap} continually 
			
 
				+increment and process checkpoints as new source data is ingested.
			
 
				 
			
 
				 .Example
			
 
				 
			
 
				-Imagine that you run a webshop that sells clothes. Every order creates a document 
			
 
				-that contains a unique order ID, the name and the category of the ordered product, 
			
 
				-its price, the ordered quantity, the exact date of the order, and some customer 
			
 
				-information (name, gender, location, etc). Your dataset contains all the transactions 
			
 
				-from last year.
			
 
				+Imagine that you run a webshop that sells clothes. Every order creates a 
			
 
				+document that contains a unique order ID, the name and the category of the 
			
 
				+ordered product, its price, the ordered quantity, the exact date of the order, 
			
 
				+and some customer information (name, gender, location, etc.). Your dataset 
			
 
				+contains all the transactions from last year.
			
 
				 
			
 
				 If you want to check the sales in the different categories in your last fiscal
			
 
				-year, define a {transform} that groups the data by the product
			
 
				-categories (women's shoes, men's clothing, etc.) and the order date. Use the
			
 
				-last year as the interval for the order date. Then add a sum aggregation on the
			
 
				-ordered quantity. The result is a {dataframe} that shows the number of sold
			
 
				+year, define a {transform} that groups the data by the product categories 
			
 
				+(women's shoes, men's clothing, etc.) and the order date. Use the last year as 
			
 
				+the interval for the order date. Then add a sum aggregation on the ordered 
			
 
				+quantity. The result is an entity-centric index that shows the number of sold
			
 
				 items in every product category in the last year.
			
 
				 
			
 
				 [role="screenshot"]
			
 
				 image::images/ml-dataframepivot.jpg["Example of a data frame pivot in {kib}"]
			
 
				 
			
 
				 IMPORTANT: The {transform} leaves your source index intact. It
			
 
				-creates a new index that is dedicated to the {dataframe}.
			
 
				+creates a new index that is dedicated to the transformed data.
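
The webshop example described in this change can be sketched in a few lines of Python. This is only an illustration of the pivot idea (group by category and order year, then sum the ordered quantity); it is not how Elasticsearch implements transforms. The documents, the field names (`category`, `quantity`, `order_date`), and the index names (`orders`, `orders_by_category`) are all hypothetical. The `transform_body` dictionary shows the assumed shape of a matching pivot definition and is never sent anywhere.

```python
import json
from collections import defaultdict
from datetime import date

# Hypothetical order documents; every field name here is an assumption.
orders = [
    {"order_id": 1, "category": "women's shoes", "quantity": 2, "order_date": date(2019, 3, 14)},
    {"order_id": 2, "category": "men's clothing", "quantity": 1, "order_date": date(2019, 5, 2)},
    {"order_id": 3, "category": "women's shoes", "quantity": 3, "order_date": date(2019, 11, 30)},
]


def pivot_orders(docs):
    """Group by (category, order year) and sum the ordered quantity."""
    totals = defaultdict(int)
    for doc in docs:
        key = (doc["category"], doc["order_date"].year)
        totals[key] += doc["quantity"]
    # One summary row per group, analogous to one document in the destination index.
    return [
        {"category": category, "order_year": year, "total_quantity": quantity}
        for (category, year), quantity in sorted(totals.items())
    ]


# Assumed shape of an equivalent pivot definition (index and field names are hypothetical).
transform_body = {
    "source": {"index": "orders"},
    "dest": {"index": "orders_by_category"},
    "pivot": {
        "group_by": {
            "category": {"terms": {"field": "category"}},
            "order_year": {
                "date_histogram": {"field": "order_date", "calendar_interval": "1y"}
            },
        },
        "aggregations": {"total_quantity": {"sum": {"field": "quantity"}}},
    },
}

summary = pivot_orders(orders)
print(json.dumps(summary, indent=2))
```

The function builds a new list and leaves `orders` untouched, which mirrors the note above that the source index stays intact while the summarized data goes to a separate destination.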