[role="xpack"]
[testenv="basic"]
[[point-in-time-api]]
=== Point in time API
++++
<titleabbrev>Point in time</titleabbrev>
++++

By default, a search request executes against the most recent visible data of
the target indices, which is called a point in time. An Elasticsearch point in
time (PIT) is a lightweight view into the state of the data as it existed when
the PIT was opened. In some cases, it's preferable to perform multiple search
requests using the same point in time. For example, if
<<indices-refresh,refreshes>> happen between `search_after` requests, the
results of those requests might not be consistent, because changes made between
searches are visible only to the more recent point in time.

A point in time must be opened explicitly before it is used in search requests.
The `keep_alive` parameter tells Elasticsearch how long it should keep the point
in time alive, for example `?keep_alive=5m`.

[source,console]
--------------------------------------------------
POST /my-index-000001/_pit?keep_alive=1m
--------------------------------------------------
// TEST[setup:my_index]

The response to the above request includes an `id`, which should be passed as
the `id` of the `pit` parameter in a search request.

[source,console]
--------------------------------------------------
POST /_search <1>
{
    "size": 100,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
        "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", <2>
        "keep_alive": "1m" <3>
    }
}
--------------------------------------------------
// TEST[catch:missing]

<1> A search request with the `pit` parameter must not specify `index`, `routing`,
or {ref}/search-request-body.html#request-body-search-preference[`preference`],
as these parameters are copied from the point in time.
<2> The `id` parameter tells Elasticsearch to execute the request using contexts
from this point in time.
<3> The `keep_alive` parameter tells Elasticsearch how long it should extend
the time to live of the point in time.

IMPORTANT: The open point in time request and each subsequent search request can
return a different `id`. Always use the most recently received `id` for the
next search request.

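The following Python sketch illustrates one way to follow this rule while paging with `search_after`. It is not an official client: `send` is a hypothetical callable that performs the HTTP request and returns the parsed JSON response, `iter_hits` is an illustrative name, and the sketch assumes that the search response carries the current identifier in a top-level `pit_id` field and that the caller supplies a sort suitable for `search_after`.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: send(method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def iter_hits(send: Send, pit_id: str, query: Dict[str, Any], sort: List[Any],
              page_size: int = 100, keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Yield all hits for `query`, paging inside a single point in time."""
    search_after = None
    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "query": query,
            "sort": sort,
            # No index, routing, or preference: they come from the point in time.
            "pit": {"id": pit_id, "keep_alive": keep_alive},
        }
        if search_after is not None:
            body["search_after"] = search_after
        response = send("POST", "/_search", body)
        # Assumption: the response reports the id to use next as "pit_id".
        pit_id = response.get("pit_id", pit_id)
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Resume after the sort values of the last hit on this page.
        search_after = hits[-1]["sort"]
```

Each request extends the point in time by `keep_alive`, so the value only has to cover the gap until the next page is requested.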
[[point-in-time-keep-alive]]
==== Keeping point in time alive
The `keep_alive` parameter, which is passed to an open point in time request and
to a search request, extends the time to live of the corresponding point in time.
The value (e.g. `1m`, see <<time-units>>) does not need to be long enough to
process all data; it just needs to be long enough for the next request.

Normally, the background merge process optimizes the index by merging together
smaller segments to create new, bigger segments. Once the smaller segments are
no longer needed, they are deleted. However, open point-in-times prevent the
old segments from being deleted because they are still in use.

TIP: Keeping older segments alive means that more disk space and file handles
are needed. Ensure that you have configured your nodes to have ample free file
handles. See <<file-descriptors>>.

Additionally, if a segment contains deleted or updated documents, then the
point in time must keep track of whether each document in the segment was live at
the time of the initial search request. Ensure that your nodes have sufficient heap
space if you have many open point-in-times on an index that is subject to ongoing
deletes or updates.

You can check how many point-in-times (i.e., search contexts) are open with the
<<cluster-nodes-stats,nodes stats API>>:

[source,console]
---------------------------------------
GET /_nodes/stats/indices/search
---------------------------------------

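As a rough illustration of reading that response, the Python sketch below collects the open search context count per node. It assumes the count is reported as `open_contexts` under each node's `indices.search` section; the function name and the sample response are made up for the example and trimmed to the relevant fields.

```python
from typing import Any, Dict


def count_open_contexts(nodes_stats: Dict[str, Any]) -> Dict[str, int]:
    """Map each node id to its open search context count (0 if absent)."""
    return {
        node_id: node.get("indices", {}).get("search", {}).get("open_contexts", 0)
        for node_id, node in nodes_stats.get("nodes", {}).items()
    }


# Made-up, trimmed nodes stats response, for illustration only.
sample = {
    "nodes": {
        "node_1": {"indices": {"search": {"open_contexts": 2}}},
        "node_2": {"indices": {"search": {"open_contexts": 1}}},
    }
}
per_node = count_open_contexts(sample)
total = sum(per_node.values())
print(per_node, total)
```

A count that keeps growing between checks can indicate point-in-times that are never closed and only expire through `keep_alive`.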
[[close-point-in-time-api]]
==== Close point in time API
A point in time is automatically closed when its `keep_alive` has elapsed.
However, keeping point-in-times open has a cost, as discussed in the
<<point-in-time-keep-alive,previous section>>. Point-in-times should be closed
as soon as they are no longer used in search requests.

[source,console]
---------------------------------------
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
---------------------------------------
// TEST[catch:missing]

The API returns the following response:

[source,console-result]
--------------------------------------------------
{
    "succeeded": true, <1>
    "num_freed": 3 <2>
}
--------------------------------------------------
// TESTRESPONSE[s/"succeeded": true/"succeeded": $body.succeeded/]
// TESTRESPONSE[s/"num_freed": 3/"num_freed": $body.num_freed/]

<1> If `true`, all search contexts associated with the point-in-time id were
successfully closed.
<2> The number of search contexts that were successfully closed.

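To make sure a point in time is closed even when a search fails, the open and close requests can be paired in one scope. The Python sketch below shows one possible way to do this. It is not part of any official client: `send` is a hypothetical callable that performs the HTTP request and returns the parsed JSON response, and `point_in_time` is an illustrative name.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: send(method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


@contextmanager
def point_in_time(send: Send, index: str,
                  keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Open a point in time on `index` and close it when the block exits."""
    opened = send("POST", f"/{index}/_pit?keep_alive={keep_alive}", None)
    # Mutable holder so the caller can store the most recently received id.
    pit: Dict[str, Any] = {"id": opened["id"]}
    try:
        yield pit
    finally:
        # Close using whatever id the caller stored last; keep the response
        # so "succeeded" and "num_freed" can be inspected afterwards.
        pit["close_result"] = send("DELETE", "/_pit", {"id": pit["id"]})
```

Inside the `with` block, the caller would assign any newer id returned by a search to `pit["id"]`, so that the close request targets the latest identifier.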