@@ -11,36 +11,14 @@ the indexing algorithm runs searches under the hood to create the vector index
 structures. So these same recommendations also help with indexing speed.
 
 [discrete]
-=== Ensure data nodes have enough memory
-
-{es} uses the https://arxiv.org/abs/1603.09320[HNSW] algorithm for approximate
-kNN search. HNSW is a graph-based algorithm which only works efficiently when
-most vector data is held in memory. You should ensure that data nodes have at
-least enough RAM to hold the vector data and index structures. To check the
-size of the vector data, you can use the <<indices-disk-usage>> API. As a
-loose rule of thumb, and assuming the default HNSW options, the bytes used will
-be `num_vectors * 4 * (num_dimensions + 12)`. When using the `byte` <<dense-vector-element-type,`element_type`>>
-the space required will be closer to `num_vectors * (num_dimensions + 12)`. Note that
-the required RAM is for the filesystem cache, which is separate from the Java
-heap.
-
-The data nodes should also leave a buffer for other ways that RAM is needed.
-For example your index might also include text fields and numerics, which also
-benefit from using filesystem cache. It's recommended to run benchmarks with
-your specific dataset to ensure there's a sufficient amount of memory to give
-good search performance.
-You can find https://elasticsearch-benchmarks.elastic.co/#tracks/so_vector[here]
-and https://elasticsearch-benchmarks.elastic.co/#tracks/dense_vector[here] some examples
-of datasets and configurations that we use for our nightly benchmarks.
-
-[discrete]
-include::search-speed.asciidoc[tag=warm-fs-cache]
-
-The following file extensions are used for the approximate kNN search:
+=== Reduce vector memory footprint
 
-* `vec` and `veq` for vector values
-* `vex` for HNSW graph
-* `vem`, `vemf`, and `vemq` for metadata
+The default <<dense-vector-element-type,`element_type`>> is `float`. But this
+can be automatically quantized during index time through
+<<dense-vector-quantization,`quantization`>>. Quantization will reduce the
+required memory by 4x, but it will also reduce the precision of the vectors. For
+`float` vectors with `dim` greater than or equal to `384`, using a
+<<dense-vector-quantization,`quantized`>> index is highly recommended.
 
 [discrete]
 === Reduce vector dimensionality
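The 4x figure in the hunk above follows directly from the element sizes: a `float` vector stores 4 bytes per dimension, while a quantized (int8 / `byte`) vector stores 1 byte per dimension. A minimal sketch of the raw-value savings (the function name and example sizes are illustrative, and HNSW graph overhead is ignored here):

```python
# Rough memory comparison for raw vector values only. Per-dimension cost:
# 4 bytes for `float`, 1 byte for `byte` or int8-quantized vectors.
# Index/graph overhead is not included in this sketch.
def raw_vector_bytes(num_vectors: int, num_dimensions: int, bytes_per_dim: int) -> int:
    return num_vectors * num_dimensions * bytes_per_dim

num_vectors, dims = 1_000_000, 384
float_bytes = raw_vector_bytes(num_vectors, dims, 4)       # 4 bytes/dim
quantized_bytes = raw_vector_bytes(num_vectors, dims, 1)   # 1 byte/dim

print(f"float:     {float_bytes / 2**30:.2f} GiB")
print(f"quantized: {quantized_bytes / 2**30:.2f} GiB")
print(f"reduction: {float_bytes / quantized_bytes:.0f}x")
```

The precision loss mentioned in the hunk is the trade-off for this saving, which is why the docs recommend measuring search quality after enabling quantization.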
@@ -54,14 +32,6 @@ reduction techniques like PCA. When experimenting with different approaches,
 it's important to measure the impact on relevance to ensure the search
 quality is still acceptable.
 
-[discrete]
-=== Reduce vector memory foot-print
-
-The default <<dense-vector-element-type,`element_type`>> is `float`. But this can be
-automatically quantized during index time through <<dense-vector-quantization,`quantization`>>. Quantization will
-reduce the required memory by 4x, but it will also reduce the precision of the vectors. For `float` vectors with
-`dim` greater than or equal to `384`, using a <<dense-vector-quantization,`quantized`>> index is highly recommended.
-
 [discrete]
 === Exclude vector fields from `_source`
 
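The context lines of the hunk above mention dimensionality reduction with PCA and stress measuring the impact on relevance. One way to do that offline, sketched below with made-up random data (real embeddings and real queries would be used in practice), is to compare nearest-neighbor results before and after the reduction:

```python
# Sketch: reduce vector dimensionality with PCA (computed via SVD) and
# check how well nearest-neighbor rankings survive the reduction.
import numpy as np

rng = np.random.default_rng(0)
docs = rng.standard_normal((500, 256)).astype(np.float32)   # toy "index"
query = rng.standard_normal(256).astype(np.float32)

# PCA: center the data, then project onto the top-k right singular vectors.
k = 64
mean = docs.mean(axis=0)
_, _, vt = np.linalg.svd(docs - mean, full_matrices=False)
components = vt[:k]                      # (k, 256) projection matrix
docs_k = (docs - mean) @ components.T    # (500, k) reduced vectors
query_k = (query - mean) @ components.T

def top_n(q, d, n=10):
    # Brute-force Euclidean kNN; Elasticsearch would use HNSW instead.
    return set(np.argsort(np.linalg.norm(d - q, axis=1))[:n])

overlap = len(top_n(query, docs) & top_n(query_k, docs_k)) / 10
print(f"top-10 overlap after PCA to {k} dims: {overlap:.0%}")
```

A high overlap suggests the reduced vectors preserve the neighborhood structure that kNN search relies on; random data like the above has little low-dimensional structure, so real embeddings typically fare much better at the same `k`.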
@@ -82,6 +52,37 @@ downsides of omitting fields from `_source`.
 Another option is to use <<synthetic-source,synthetic `_source`>> if all
 your index fields support it.
 
+[discrete]
+=== Ensure data nodes have enough memory
+
+{es} uses the https://arxiv.org/abs/1603.09320[HNSW] algorithm for approximate
+kNN search. HNSW is a graph-based algorithm which only works efficiently when
+most vector data is held in memory. You should ensure that data nodes have at
+least enough RAM to hold the vector data and index structures. To check the
+size of the vector data, you can use the <<indices-disk-usage>> API. As a
+loose rule of thumb, and assuming the default HNSW options, the bytes used will
+be `num_vectors * 4 * (num_dimensions + 12)`. When using the `byte` <<dense-vector-element-type,`element_type`>>
+the space required will be closer to `num_vectors * (num_dimensions + 12)`. Note that
+the required RAM is for the filesystem cache, which is separate from the Java
+heap.
+
+The data nodes should also leave a buffer for other ways that RAM is needed.
+For example, your index might also include text fields and numerics, which also
+benefit from using filesystem cache. It's recommended to run benchmarks with
+your specific dataset to ensure there's a sufficient amount of memory to give
+good search performance.
+You can find examples of datasets and configurations that we use for our
+nightly benchmarks https://elasticsearch-benchmarks.elastic.co/#tracks/so_vector[here]
+and https://elasticsearch-benchmarks.elastic.co/#tracks/dense_vector[here].
+
+[discrete]
+include::search-speed.asciidoc[tag=warm-fs-cache]
+
+The following file extensions are used for the approximate kNN search:
+
+* `vec` and `veq` for vector values
+* `vex` for HNSW graph
+* `vem`, `vemf`, and `vemq` for metadata
 
 [discrete]
 === Reduce the number of index segments
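The rule of thumb added in the last hunk can be turned into a quick sizing estimator. A sketch, assuming the default HNSW options exactly as the hunk states (the helper name and the example index size are made up):

```python
# RAM rule of thumb from the docs hunk above, for default HNSW options:
#   float: num_vectors * 4 * (num_dimensions + 12) bytes
#   byte:  num_vectors * (num_dimensions + 12) bytes
# This RAM is needed as filesystem cache, separate from the JVM heap.
def hnsw_ram_bytes(num_vectors: int, num_dimensions: int, element_type: str = "float") -> int:
    bytes_per_dim = {"float": 4, "byte": 1}[element_type]
    return num_vectors * bytes_per_dim * (num_dimensions + 12)

# Example: 10 million 768-dimensional float vectors.
gib = hnsw_ram_bytes(10_000_000, 768) / 2**30
print(f"~{gib:.1f} GiB of filesystem cache (plus headroom for other fields)")
```

An estimate like this only covers the vector data and index structures; as the hunk notes, the nodes should keep extra headroom for text and numeric fields that also benefit from the filesystem cache.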