123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149 |
- [role="xpack"]
- [testenv="basic"]
- [[index-lifecycle-error-handling]]
- == Resolve lifecycle policy execution errors
- When {ilm-init} executes a lifecycle policy, it's possible for errors to occur
- while performing the necessary index operations for a step.
- When this happens, {ilm-init} moves the index to an `ERROR` step.
- If {ilm-init] cannot resolve the error automatically, execution is halted
- until you resolve the underlying issues with the policy, index, or cluster.
- For example, you might have a `shrink-index` policy that shrinks an index to four shards once it
- is at least five days old:
- [source,console]
- --------------------------------------------------
- PUT _ilm/policy/shrink-index
- {
- "policy": {
- "phases": {
- "warm": {
- "min_age": "5d",
- "actions": {
- "shrink": {
- "number_of_shards": 4
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST
- There is nothing that prevents you from applying the `shrink-index` policy to a new
- index that has only two shards:
- [source,console]
- --------------------------------------------------
- PUT /myindex
- {
- "settings": {
- "index.number_of_shards": 2,
- "index.lifecycle.name": "shrink-index"
- }
- }
- --------------------------------------------------
- // TEST[continued]
- After five days, {ilm-init} attempts to shrink `myindex` from two shards to four shards.
- Because the shrink action cannot _increase_ the number of shards, this operation fails
- and {ilm-init} moves `myindex` to the `ERROR` step.
- You can use the <<ilm-explain-lifecycle,{ilm-init} Explain API>> to get information about
- what went wrong:
- [source,console]
- --------------------------------------------------
- GET /myindex/_ilm/explain
- --------------------------------------------------
- // TEST[continued]
- Which returns the following information:
- [source,console-result]
- --------------------------------------------------
- {
- "indices" : {
- "myindex" : {
- "index" : "myindex",
- "managed" : true,
- "policy" : "shrink-index", <1>
- "lifecycle_date_millis" : 1541717265865,
- "age": "5.1d", <2>
- "phase" : "warm", <3>
- "phase_time_millis" : 1541717272601,
- "action" : "shrink", <4>
- "action_time_millis" : 1541717272601,
- "step" : "ERROR", <5>
- "step_time_millis" : 1541717272688,
- "failed_step" : "shrink", <6>
- "step_info" : {
- "type" : "illegal_argument_exception", <7>
- "reason" : "the number of target shards [4] must be less that the number of source shards [2]"
- },
- "phase_execution" : {
- "policy" : "shrink-index",
- "phase_definition" : { <8>
- "min_age" : "5d",
- "actions" : {
- "shrink" : {
- "number_of_shards" : 4
- }
- }
- },
- "version" : 1,
- "modified_date_in_millis" : 1541717264230
- }
- }
- }
- }
- --------------------------------------------------
- // TESTRESPONSE[skip:no way to know if we will get this response immediately]
- <1> The policy being used to manage the index: `shrink-index`
- <2> The index age: 5.1 days
- <3> The phase the index is currently in: `warm`
- <4> The current action: `shrink`
- <5> The step the index is currently in: `ERROR`
- <6> The step that failed to execute: `shrink`
- <7> The type of error and a description of that error.
- <8> The definition of the current phase from the `shrink-index` policy
- To resolve this, you could update the policy to shrink the index to a single shard after 5 days:
- [source,console]
- --------------------------------------------------
- PUT _ilm/policy/shrink-index
- {
- "policy": {
- "phases": {
- "warm": {
- "min_age": "5d",
- "actions": {
- "shrink": {
- "number_of_shards": 1
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[continued]
- [discrete]
- === Retrying failed lifecycle policy steps
- Once you fix the problem that put an index in the `ERROR` step,
- you might need to explicitly tell {ilm-init} to retry the step:
- [source,console]
- --------------------------------------------------
- POST /myindex/_ilm/retry
- --------------------------------------------------
- // TEST[skip:we can't be sure the index is ready to be retried at this point]
- {ilm-init} subsequently attempts to re-run the step that failed.
- You can use the <<ilm-explain-lifecycle,{ilm-init} Explain API>> to monitor the progress.
|