[[plugins-delete-by-query]]
== Delete By Query Plugin

The delete by query plugin adds support for deleting all of the documents
(from one or more indices) which match the specified query. It is a
replacement for the problematic _delete-by-query_ functionality which has been
removed from Elasticsearch core.

Internally, it uses the <<scroll-scan, Scan/Scroll>> and <<docs-bulk, Bulk>>
APIs to delete documents in an efficient and safe manner. It is slower than
the old _delete-by-query_ functionality, but fixes the problems with the
previous implementation.

TIP: Queries which match large numbers of documents may run for a long time,
as every document has to be deleted individually. Don't use _delete-by-query_
to clean out all or most documents in an index. Instead, create a new index and
reindex only the documents you want to keep.

=== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
bin/plugin install elasticsearch/elasticsearch-delete-by-query
----------------------------------------------------------------

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.

=== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
bin/plugin remove elasticsearch/elasticsearch-delete-by-query
----------------------------------------------------------------

The node must be stopped before removing the plugin.

=== Usage

The query can either be provided using a simple query string as
a parameter:

[source,shell]
--------------------------------------------------
curl -XDELETE 'http://localhost:9200/twitter/tweet/_query?q=user:kimchy'
--------------------------------------------------

or using the <<query-dsl,Query DSL>> defined within the request body:

[source,js]
--------------------------------------------------
curl -XDELETE 'http://localhost:9200/twitter/tweet/_query' -d '{
  "query" : { <1>
    "term" : { "user" : "kimchy" }
  }
}
'
--------------------------------------------------

<1> The query must be passed as a value to the `query` key, in the same way as
the <<search-search,search API>>.

Both examples delete all tweets by the user `kimchy` from the `twitter` index.

Delete-by-query supports deletion across
<<search-multi-index-type,multiple indices and multiple types>>.
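
The multi-index, multi-type form of the request path can be sketched as follows. This is a minimal illustration, not part of the plugin: the helper name `delete_by_query_path`, the `facebook` index and the `post` type are hypothetical, and requiring at least one index is a choice made for this sketch.

```python
from urllib.parse import quote


def delete_by_query_path(indices, types=()):
    """Build the path of a delete-by-query request (hypothetical helper).

    Index names and type names are each joined with commas, following the
    multi-index, multi-type URL convention.
    """
    if not indices:
        # Assumption of this sketch: always name the target indices explicitly.
        raise ValueError("at least one index is required")
    parts = [",".join(quote(name, safe="*") for name in indices)]
    if types:
        parts.append(",".join(quote(name, safe="*") for name in types))
    parts.append("_query")
    return "/" + "/".join(parts)


print(delete_by_query_path(["twitter"], ["tweet"]))
# /twitter/tweet/_query
print(delete_by_query_path(["twitter", "facebook"], ["tweet", "post"]))
# /twitter,facebook/tweet,post/_query
```

The resulting path is appended to the node address, for example `http://localhost:9200`, and sent with the `DELETE` method as in the `curl` examples above.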

==== Query-string parameters

The following query string parameters are supported:

`q`::

Instead of using the <<query-dsl,Query DSL>> to pass a `query` in the request
body, you can use the `q` query string parameter to specify a query using
<<query-string-syntax,`query_string` syntax>>. In this case, the following
additional parameters are supported: `df`, `analyzer`, `default_operator`,
`lowercase_expanded_terms`, `analyze_wildcard` and `lenient`.
See <<search-uri-request>> for details.

`size`::

The number of hits returned *per shard* by the <<scroll-scan,scroll/scan>>
request. Defaults to 10. May also be specified in the request body.

`timeout`::

The maximum execution time of the delete by query process. Once expired, no
more documents will be deleted.

`routing`::

A comma-separated list of routing values to control which shards the delete by
query request should be executed on.
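
As a rough sketch of how these parameters combine into a query string, the snippet below encodes them with the Python standard library. The helper name `delete_by_query_params` and the routing values `a` and `b` are hypothetical; rejecting `df`, `analyzer` and similar options when `q` is absent is a choice made for this sketch, reflecting that they only apply to the `q` form.

```python
from urllib.parse import urlencode


def delete_by_query_params(q=None, size=None, timeout=None, routing=None, **q_options):
    """Encode delete-by-query parameters as a URL query string (hypothetical helper).

    `q_options` carries the extra parameters that only apply together with `q`,
    such as `df`, `analyzer` or `default_operator`.
    """
    params = {}
    if q is not None:
        params["q"] = q
        params.update(q_options)
    elif q_options:
        raise ValueError("df, analyzer and similar options require the q parameter")
    if size is not None:
        params["size"] = size  # hits fetched per shard by each scroll request
    if timeout is not None:
        params["timeout"] = timeout
    if routing:
        params["routing"] = ",".join(routing)  # comma-separated routing values
    return urlencode(params)


print(delete_by_query_params(q="user:kimchy", size=100, routing=["a", "b"]))
# q=user%3Akimchy&size=100&routing=a%2Cb
```

The encoded string is appended after a `?` to the `_query` endpoint. The colon and comma are percent-encoded by `urlencode`, which is equivalent to the unencoded form used in the `curl` example.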

==== Response body

The JSON response looks like this:

[source,js]
--------------------------------------------------
{
  "took" : 639,
  "timed_out" : false,
  "_indices" : {
    "_all" : {
      "found" : 5901,
      "deleted" : 5901,
      "missing" : 0,
      "failed" : 0
    },
    "twitter" : {
      "found" : 5901,
      "deleted" : 5901,
      "missing" : 0,
      "failed" : 0
    }
  },
  "failures" : [ ]
}
--------------------------------------------------
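
A client may want to check whether a request removed everything it matched. The following is a minimal sketch that relies only on the fields shown in the response above; the function names `is_complete` and `incomplete_indices` are hypothetical.

```python
import json


def is_complete(response):
    """True if the request did not time out, reported no failures,
    and deleted every document it found (per the `_all` totals)."""
    totals = response["_indices"]["_all"]
    return (
        not response["timed_out"]
        and not response["failures"]
        and totals["deleted"] == totals["found"]
    )


def incomplete_indices(response):
    """Return the counters of each index where some found documents remain."""
    return {
        name: counters
        for name, counters in response["_indices"].items()
        if name != "_all" and counters["deleted"] != counters["found"]
    }


# The sample response from this section.
response = json.loads("""
{
  "took": 639,
  "timed_out": false,
  "_indices": {
    "_all":    {"found": 5901, "deleted": 5901, "missing": 0, "failed": 0},
    "twitter": {"found": 5901, "deleted": 5901, "missing": 0, "failed": 0}
  },
  "failures": []
}
""")

print(is_complete(response))         # True
print(incomplete_indices(response))  # {}
```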

Internally, the query is used to execute an initial
<<scroll-scan,scroll/scan>> request. As hits are pulled from the scroll API,
they are passed to the <<docs-bulk,Bulk API>> for deletion.

IMPORTANT: Delete by query will only delete the version of the document that
was visible to search at the time the request was executed. Any documents
that have been reindexed or updated during execution will not be deleted.

Since documents can be updated or deleted by external operations during the
_scan-scroll-bulk_ process, the plugin keeps track of different counters for
each index, with the totals displayed under the `_all` index. The counters
are as follows:

`found`::

The number of documents matching the query for the given index.

`deleted`::

The number of documents successfully deleted for the given index.

`missing`::

The number of documents that were missing when the plugin tried to delete
them. Missing documents were present when the original query was run, but have
already been deleted by another process.

`failed`::

The number of documents that failed to be deleted for the given index. A
document may fail to be deleted if it has been updated to a new version by
another process, or if the shard containing the document has gone missing due
to hardware failure, for example.
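
How the counters arise can be illustrated with a small in-memory model. This is not the plugin's code: the dictionary `store` stands in for an index, `snapshot` stands in for the hits returned by the scroll, and comparing version numbers is an assumption drawn from the description above that concurrently updated documents are not deleted.

```python
def delete_by_snapshot(store, snapshot):
    """Model the delete phase of the scan-scroll-bulk process.

    store:    dict mapping document id -> current version (hypothetical index)
    snapshot: list of (id, version) pairs as seen by the original search
    """
    counters = {"found": len(snapshot), "deleted": 0, "missing": 0, "failed": 0}
    for doc_id, seen_version in snapshot:
        current = store.get(doc_id)
        if current is None:
            # Already deleted by another process since the search ran.
            counters["missing"] += 1
        elif current != seen_version:
            # Updated to a new version since the search ran; left in place.
            counters["failed"] += 1
        else:
            del store[doc_id]
            counters["deleted"] += 1
    return counters


store = {"1": 1, "2": 1, "3": 1}
snapshot = [(doc_id, version) for doc_id, version in store.items()]

# Concurrent activity between the search and the deletes:
del store["2"]   # another process deletes document 2
store["3"] = 2   # another process updates document 3

print(delete_by_snapshot(store, snapshot))
# {'found': 3, 'deleted': 1, 'missing': 1, 'failed': 1}
print(store)
# {'3': 2}
```

In this model the updated document survives the request, which matches the behavior described in the note above.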