[[knn-search-api]]
=== kNN search API
++++
<titleabbrev>kNN search</titleabbrev>
++++

experimental::[]

Performs a k-nearest neighbor (kNN) search and returns the matching documents.

////
[source,console]
----
PUT my-index
{
  "mappings": {
    "properties": {
      "image_vector": {
        "type": "dense_vector",
        "dims": 3,
        "index": true,
        "similarity": "l2_norm"
      }
    }
  }
}

PUT my-index/_doc/1?refresh
{
  "image_vector" : [0.5, 10, 6]
}
----
////

[source,console]
----
GET my-index/_knn_search
{
  "knn": {
    "field": "image_vector",
    "query_vector": [0.3, 0.1, 1.2],
    "k": 10,
    "num_candidates": 100
  },
  "_source": ["name", "date"]
}
----
// TEST[continued]

[[knn-search-api-request]]
==== {api-request-title}

`GET <target>/_knn_search`

`POST <target>/_knn_search`

[[knn-search-api-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have the `read`
<<privileges-list-indices,index privilege>> for the target data stream, index,
or alias.

[[knn-search-api-desc]]
==== {api-description-title}

The kNN search API performs a k-nearest neighbor (kNN) search on a
<<dense-vector,`dense_vector`>> field. Given a query vector, it finds the _k_
closest vectors and returns those documents as search hits.

//tag::hnsw-algorithm[]
{es} uses the https://arxiv.org/abs/1603.09320[HNSW algorithm] to support
efficient kNN search. Like most kNN algorithms, HNSW is an approximate method
that sacrifices result accuracy for improved search speed. This means the
results returned are not always the true _k_ closest neighbors.
//end::hnsw-algorithm[]

[[knn-search-api-path-params]]
==== {api-path-parms-title}

`<target>`::
(Optional, string) Comma-separated list of data streams, indices, and aliases
to search. Supports wildcards (`*`). To search all data streams and indices,
use `*` or `_all`.

WARNING: kNN search does not yet work with <<filter-alias,filtered aliases>>.
Running a kNN search against a filtered alias may incorrectly result in fewer
than _k_ hits.
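
As a minimal sketch of the `<target>` syntax, the following request searches
one named index plus every index that matches a wildcard pattern. The pattern
`my-other-index-*` is hypothetical, and the example assumes that every matched
index maps `image_vector` as an indexed `dense_vector` field with three
dimensions.

[source,console]
----
GET my-index,my-other-index-*/_knn_search
{
  "knn": {
    "field": "image_vector",
    "query_vector": [0.3, 0.1, 1.2],
    "k": 10,
    "num_candidates": 100
  }
}
----
// TEST[skip:index pattern is hypothetical]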

[role="child_attributes"]
[[knn-search-api-query-params]]
==== {api-query-parms-title}

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=routing]
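
For illustration, the sketch below passes a custom routing value as a query
parameter. The value `user-1` is hypothetical, and it is useful only when the
documents of interest were indexed with the same routing value.

[source,console]
----
GET my-index/_knn_search?routing=user-1
{
  "knn": {
    "field": "image_vector",
    "query_vector": [0.3, 0.1, 1.2],
    "k": 10,
    "num_candidates": 100
  }
}
----
// TEST[skip:routing value is hypothetical]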

[role="child_attributes"]
[[knn-search-api-request-body]]
==== {api-request-body-title}

`knn`::
(Required, object) Defines the kNN query to run.
+
.Properties of `knn` object
[%collapsible%open]
====
`field`::
(Required, string) The name of the vector field to search against. Must be a
<<index-vectors-knn-search, `dense_vector` field with indexing enabled>>.

`query_vector`::
(Required, array of floats) Query vector. Must have the same number of
dimensions as the vector field you are searching against.

`k`::
(Required, integer) Number of nearest neighbors to return as top hits. This
value must be less than `num_candidates`.

`num_candidates`::
(Required, integer) The number of nearest neighbor candidates to consider per
shard. Cannot exceed 10,000. {es} collects `num_candidates` results from each
shard, then merges them to find the top `k` results. Increasing
`num_candidates` tends to improve the accuracy of the final `k` results.
====

include::{es-repo-dir}/search/search.asciidoc[tag=docvalue-fields-def]
include::{es-repo-dir}/search/search.asciidoc[tag=fields-param-def]
include::{es-repo-dir}/search/search.asciidoc[tag=source-filtering-def]
include::{es-repo-dir}/search/search.asciidoc[tag=stored-fields-def]
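
The request-body parameters above can be combined as in the following sketch.
It asks for the five nearest neighbors out of 50 candidates per shard, returns
two fields through `fields`, and turns off `_source` in the hits. The field
names `name` and `date` are assumptions: the example presumes the indexed
documents contain them.

[source,console]
----
GET my-index/_knn_search
{
  "knn": {
    "field": "image_vector",
    "query_vector": [0.3, 0.1, 1.2],
    "k": 5,
    "num_candidates": 50
  },
  "fields": ["name", "date"],
  "_source": false
}
----
// TEST[skip:field names are hypothetical]

Keeping `k` fixed and raising `num_candidates` is the usual way to trade search
speed for more accurate results.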

[role="child_attributes"]
[[knn-search-api-response-body]]
==== {api-response-body-title}

A kNN search response has the same structure as a
<<search-api-response-body, search API response>>. However, certain sections
have a meaning specific to kNN search:

* The <<search-api-response-body-score,document `_score`>> is determined by
the similarity between the query and document vector. See
<<dense-vector-similarity, `similarity`>>.
* The `hits.total` object contains the total number of nearest neighbor
candidates considered, which is `num_candidates * num_shards`. The
`hits.total.relation` is always `eq`, indicating an exact value.