[role="xpack"]
[[how-watcher-works]]
== How {watcher} works

You <<watch-definition, add watches>> to automatically perform an action when
certain conditions are met. The conditions are generally based on data you've
loaded into the watch, also known as the _Watch Payload_. This payload can be
loaded from different sources - from Elasticsearch, an external HTTP service, or
even a combination of the two.
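
As a rough sketch of the "combination" case, a `chain` input can load data from
more than one source into a single payload. The input names `first` and
`second`, the index name, and the HTTP host, port, and path are hypothetical
placeholders chosen for illustration:

[source,js]
--------------------------------------------------
"input" : {
  "chain" : {
    "inputs" : [
      {
        "first" : { <1>
          "search" : {
            "request" : {
              "indices" : [ "log-events" ],
              "body" : {
                "size" : 0,
                "query" : { "match" : { "status" : "error" } }
              }
            }
          }
        }
      },
      {
        "second" : { <2>
          "http" : {
            "request" : {
              "host" : "example.host",
              "port" : 8080,
              "path" : "/status"
            }
          }
        }
      }
    ]
  }
}
--------------------------------------------------
// NOTCONSOLE

<1> Hypothetical name for an input that searches Elasticsearch.
<2> Hypothetical name for an input that calls an external HTTP service.

With an input like this, the result of each source would be available in the
payload under its own name, for example `ctx.payload.first` and
`ctx.payload.second`.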

For example, you could configure a watch to send an email to the sysadmin when a
search in the logs data indicates that there are too many 503 errors in the last
5 minutes.

This topic describes the elements of a watch and how watches operate.

[discrete]
[[watch-definition]]
=== Watch definition

A watch consists of a _trigger_, _input_, _condition_, and _actions_. The actions
define what needs to be done once the condition is met. In addition, you can
define _transforms_ to process and prepare the watch payload before
executing the actions.

<<trigger,Trigger>>::
Determines when the watch is checked. A watch must have a trigger.

<<input,Input>>::
Loads data into the watch payload. If no input is specified, an empty payload is
loaded.

<<condition,Condition>>::
Controls whether the watch actions are executed. If no condition is specified,
the condition defaults to `always`.

<<transform,Transform>>::
Processes the watch payload to prepare it for the watch actions. You can define
transforms at the watch level or define action-specific transforms. Optional.

<<actions,Actions>>::
Specify what happens when the watch condition is met.

[[watch-definition-example]]
For example, the following snippet shows a <<watcher-api-put-watch,create or
update watch>> request that defines a watch that looks for log error events:

[source,console]
--------------------------------------------------
PUT _watcher/watch/log_errors
{
  "metadata" : { <1>
    "color" : "red"
  },
  "trigger" : { <2>
    "schedule" : {
      "interval" : "5m"
    }
  },
  "input" : { <3>
    "search" : {
      "request" : {
        "indices" : "log-events",
        "body" : {
          "size" : 0,
          "query" : { "match" : { "status" : "error" } }
        }
      }
    }
  },
  "condition" : { <4>
    "compare" : { "ctx.payload.hits.total" : { "gt" : 5 }}
  },
  "transform" : { <5>
    "search" : {
      "request" : {
        "indices" : "log-events",
        "body" : {
          "query" : { "match" : { "status" : "error" } }
        }
      }
    }
  },
  "actions" : { <6>
    "my_webhook" : {
      "webhook" : {
        "method" : "POST",
        "host" : "mylisteninghost",
        "port" : 9200,
        "path" : "/{{ctx.watch_id}}",
        "body" : "Encountered {{ctx.payload.hits.total}} errors"
      }
    },
    "email_administrator" : {
      "email" : {
        "to" : "sys.admino@host.domain",
        "subject" : "Encountered {{ctx.payload.hits.total}} errors",
        "body" : "Too many errors in the system, see attached data",
        "attachments" : {
          "attached_data" : {
            "data" : {
              "format" : "json"
            }
          }
        },
        "priority" : "high"
      }
    }
  }
}
--------------------------------------------------

<1> Metadata - You can attach optional static metadata to a watch.
<2> Trigger - This schedule trigger executes the watch every 5 minutes.
<3> Input - This input searches for errors in the `log-events` index and
    loads the response into the watch payload.
<4> Condition - This condition checks whether there are more than 5 error
    events (hits in the search response). If there are, execution
    continues for all `actions`.
<5> Transform - If the watch condition is met, this transform loads all of the
    errors into the watch payload by searching for the errors using
    the default search type, `query_then_fetch`. All of the watch
    actions have access to this payload.
<6> Actions - This watch has two actions. The `my_webhook` action notifies a
    third-party system about the problem. The `email_administrator`
    action sends a high-priority email to the system administrator.
    The watch payload that contains the errors is attached to the
    email.

[discrete]
[[watch-execution]]
=== Watch execution

[[schedule-scheduler]]
When you add a watch, {watcher} immediately registers its trigger with the
appropriate trigger engine. Watches that have a `schedule` trigger are
registered with the `scheduler` trigger engine.

The scheduler tracks time and triggers watches according to their schedules.
A scheduler runs on each node that contains one of the `.watches` shards, and
it is bound to the {watcher} lifecycle. Although all primaries and replicas are
taken into account when a watch is triggered, {watcher} ensures that each
watch is triggered on only one of those shards. The more replica shards you
add, the more widely watch execution can be distributed. If you add or remove
replicas, all watches need to be reloaded. If a shard is relocated, the
primary and all replicas of that shard reload.

Because watches are executed on the nodes that hold the watch shards, you can
create dedicated watcher nodes by using shard allocation filtering.

You could configure nodes with a dedicated `node.attr.role: watcher` property and
then configure the `.watches` index like this:

[source,console]
------------------------
PUT .watches/_settings
{
  "index.routing.allocation.include.role": "watcher"
}
------------------------
// TEST[skip:indexes don't assign]

When the {watcher} service is stopped, the scheduler stops with it. Trigger
engines use a separate thread pool from the one used to execute watches.

When a watch is triggered, {watcher} queues it up for execution. A `watch_record`
document is created and added to the watch history and the watch's status is set
to `awaits_execution`.

When execution starts, {watcher} creates a watch execution context for the watch.
The execution context provides scripts and templates with access to the watch
metadata, payload, watch ID, execution time, and trigger information. For more
information, see <<watch-execution-context, Watch Execution Context>>.

During the execution process, {watcher}:

. Loads the input data as the payload in the watch execution context. This makes
the data available to all subsequent steps in the execution process. This step
is controlled by the input of the watch.
. Evaluates the watch condition to determine whether or not to continue processing
the watch. If the condition is met (evaluates to `true`), processing advances
to the next step. If it is not met (evaluates to `false`), execution of the watch
stops.
. Applies transforms to the watch payload (if needed).
. Executes the watch actions, provided the condition is met and the watch is not
<<watch-acknowledgment-throttling, throttled>>.

When the watch execution finishes, the execution result is recorded as a
_Watch Record_ in the watch history. The watch record includes the execution
time and duration, whether the watch condition was met, and the status of each
action that was executed.

The following diagram shows the watch execution process:

image::images/watch-execution.jpg[align="center"]
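
One way to observe these steps for a single run is the
<<watcher-api-execute-watch,execute watch API>>. The following sketch assumes
that the `log_errors` watch from the earlier example exists. It asks {watcher}
to simulate every action rather than execute it, so the input, condition, and
transform run but no webhook call or email is sent:

[source,console]
--------------------------------------------------
POST _watcher/watch/log_errors/_execute
{
  "action_modes" : {
    "_all" : "simulate"
  }
}
--------------------------------------------------
// TEST[skip:illustrative example]

The response should contain the resulting watch record, including the outcome
of the condition and the status of each action.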

[discrete]
[[watch-acknowledgment-throttling]]
=== Watch acknowledgment and throttling

{watcher} supports both time-based and acknowledgment-based throttling. This
enables you to prevent actions from being repeatedly executed for the same event.

By default, {watcher} uses time-based throttling with a throttle period of 5
seconds. This means that if a watch is executed every second, its actions are
performed a maximum of once every 5 seconds, even when the condition is always
met. You can configure the throttle period on a per-action basis or at the
watch level.
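
As an illustration, the following fragment sets a `throttle_period` at the watch
level and overrides it for one action. The durations are arbitrary example
values, and the trigger, input, and condition are omitted for brevity:

[source,js]
--------------------------------------------------
{
  "throttle_period" : "15m", <1>
  "actions" : {
    "email_administrator" : {
      "throttle_period" : "30m", <2>
      "email" : {
        "to" : "sys.admino@host.domain",
        "subject" : "Encountered {{ctx.payload.hits.total}} errors"
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

<1> Watch-level throttle period (example value).
<2> Action-level throttle period for this action only (example value).

In this sketch, the watch-level period would apply to any action that does not
set a period of its own.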

Acknowledgment-based throttling enables you to tell {watcher} not to send any more
notifications about a watch as long as its condition is met. Once the condition
evaluates to `false`, the acknowledgment is cleared and {watcher} resumes executing
the watch actions normally.

For more information, see <<actions-ack-throttle>>.
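
As a brief sketch, assuming the `log_errors` watch from the earlier example has
already executed its actions at least once, you could acknowledge all of its
actions, or only the `email_administrator` action, with requests like these:

[source,console]
--------------------------------------------------
PUT _watcher/watch/log_errors/_ack

PUT _watcher/watch/log_errors/_ack/email_administrator
--------------------------------------------------
// TEST[skip:illustrative example]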

[discrete]
[[watch-active-state]]
=== Watch active state

By default, when you add a watch it is immediately set to the _active_ state,
registered with the appropriate trigger engine, and executed according
to its configured trigger.

You can also set a watch to the _inactive_ state. Inactive watches are not
registered with a trigger engine and can never be triggered.

To set a watch to the inactive state when you create it, set the
<<watcher-api-put-watch,`active`>> parameter to `false`. To
deactivate an existing watch, use the
<<watcher-api-deactivate-watch,deactivate watch API>>. To reactivate an
inactive watch, use the
<<watcher-api-activate-watch,activate watch API>>.
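
The following requests sketch these three operations. The watch ID
`my_inactive_watch` and the `log_message` action are hypothetical names used
only for illustration. The `active` query parameter takes a boolean value:

[source,console]
--------------------------------------------------
PUT _watcher/watch/my_inactive_watch?active=false
{
  "trigger" : {
    "schedule" : { "interval" : "5m" }
  },
  "actions" : {
    "log_message" : {
      "logging" : { "text" : "Watch {{ctx.watch_id}} executed" }
    }
  }
}

PUT _watcher/watch/my_inactive_watch/_activate

PUT _watcher/watch/my_inactive_watch/_deactivate
--------------------------------------------------
// TEST[skip:illustrative example]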

NOTE: You can use the <<watcher-api-execute-watch,execute watch API>>
to force the execution of a watch even when it is inactive.

Deactivating watches is useful in a variety of situations. For example, if you
have a watch that monitors an external system and you need to take that system
down for maintenance, you can deactivate the watch to prevent it from falsely
reporting availability issues during the maintenance window.

Deactivating a watch also enables you to keep it around for future use without
deleting it from the system.

[discrete]
[[scripts-templates]]
=== Scripts and templates

You can use scripts and templates when defining a watch. Scripts and templates
can reference elements in the watch execution context, including the watch payload.
The execution context defines variables you can use in a script and parameter
placeholders in a template.

{watcher} uses the Elasticsearch script infrastructure, which supports
<<inline-templates-scripts,inline>> and <<stored-templates-scripts, stored>>
templates and scripts. Scripts and templates are compiled
and cached by Elasticsearch to optimize recurring execution. Autoloading is also
supported. For more information, see <<modules-scripting>> and
<<modules-scripting-using>>.

[discrete]
[[watch-execution-context]]
==== Watch execution context

The following snippet shows the basic structure of the _Watch Execution Context_:

[source,js]
----------------------------------------------------------------------
{
  "ctx" : {
    "metadata" : { ... }, <1>
    "payload" : { ... }, <2>
    "watch_id" : "<id>", <3>
    "execution_time" : "20150220T00:00:10Z", <4>
    "trigger" : { <5>
      "triggered_time" : "20150220T00:00:10Z",
      "scheduled_time" : "20150220T00:00:00Z"
    },
    "vars" : { ... } <6>
  }
}
----------------------------------------------------------------------
// NOTCONSOLE

<1> Any static metadata specified in the watch definition.
<2> The current watch payload.
<3> The ID of the executing watch.
<4> A timestamp that shows when the watch execution started.
<5> Information about the trigger event. For a `schedule` trigger, this
    consists of the `triggered_time` (when the watch was triggered)
    and the `scheduled_time` (when the watch was scheduled to be triggered).
<6> Dynamic variables that can be set and accessed by different constructs
    during the execution. These variables are scoped to a single execution
    (that is, they're not persisted and can't be used between different
    executions of the same watch).
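
As a small sketch of how these context values can be referenced, the following
`logging` action writes several of them to the log through Mustache
placeholders. The action name `log_context` is hypothetical, and the fragment
assumes the watch metadata contains a `color` field:

[source,js]
----------------------------------------------------------------------
"actions" : {
  "log_context" : {
    "logging" : {
      "text" : "Watch {{ctx.watch_id}} ({{ctx.metadata.color}}) was scheduled for {{ctx.trigger.scheduled_time}} and started at {{ctx.execution_time}}"
    }
  }
}
----------------------------------------------------------------------
// NOTCONSOLE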

[discrete]
[[scripts]]
==== Using scripts

You can use scripts to define <<condition-script,conditions>> and
<<transform-script,transforms>>. The default scripting language is
<<modules-scripting-painless,Painless>>.

NOTE: Starting with 5.0, Elasticsearch is shipped with the new
<<modules-scripting-painless,Painless>> scripting language.
Painless was created and designed specifically for use in Elasticsearch.
Beyond providing an extensive feature set, its biggest trait is that it's
properly sandboxed and safe to use anywhere in the system (including in
{watcher}) without the need to enable dynamic scripting.

Scripts can reference any of the values in the watch execution context or values
explicitly passed through script parameters.

For example, if the watch metadata contains a `color` field
(e.g. `"metadata" : {"color": "red"}`), you can access its value via the
`ctx.metadata.color` variable. If you pass in a `color` parameter as part of the
condition or transform definition (e.g. `"params" : {"color": "red"}`), you can
access its value via the `params.color` variable.
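
The following script condition is a minimal sketch that combines both kinds of
values. It assumes a search input that populates `ctx.payload.hits.total` and
watch metadata that contains a `color` field. The `threshold` and `color`
parameters are illustrative, and in Painless they are read through the `params`
map:

[source,js]
----------------------------------------------------------------------
"condition" : {
  "script" : {
    "source" : "return ctx.payload.hits.total > params.threshold && ctx.metadata.color == params.color",
    "params" : {
      "threshold" : 5,
      "color" : "red"
    }
  }
}
----------------------------------------------------------------------
// NOTCONSOLE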

[discrete]
[[templates]]
==== Using templates

You use templates to define dynamic content for a watch. At execution time,
templates pull in data from the watch execution context. For example, you can use
a template to populate the `subject` field for an `email` action with data stored
in the watch payload. Templates can also access values explicitly passed through
template parameters.

You specify templates using the https://mustache.github.io[Mustache] templating
language.

For example, the following snippet shows how templates enable dynamic subjects
in sent emails:

[source,js]
----------------------------------------------------------------------
{
  "actions" : {
    "email_notification" : {
      "email" : {
        "subject" : "{{ctx.metadata.color}} alert"
      }
    }
  }
}
----------------------------------------------------------------------
// NOTCONSOLE

[discrete]
[[inline-templates-scripts]]
===== Inline templates and scripts

To define an inline template or script, you specify it directly in the
value of a field. For example, the following snippet configures the subject of
the `email` action using an inline template that references the `color` value in
the context metadata.

[source,js]
----------------------------------------------------------------------
"actions" : {
  "email_notification" : {
    "email" : {
      "subject" : "{{ctx.metadata.color}} alert"
    }
  }
}
----------------------------------------------------------------------
// NOTCONSOLE

For a script, you simply specify the inline script as the value of the `script`
field. For example:

[source,js]
----------------------------------------------------------------------
"condition" : {
  "script" : "return true"
}
----------------------------------------------------------------------
// NOTCONSOLE

You can also explicitly specify the inline type by using a formal object
definition as the field value. For example:

[source,js]
----------------------------------------------------------------------
"actions" : {
  "email_notification" : {
    "email" : {
      "subject" : {
        "source" : "{{ctx.metadata.color}} alert"
      }
    }
  }
}
----------------------------------------------------------------------
// NOTCONSOLE

The formal object definition for a script would be:

[source,js]
----------------------------------------------------------------------
"condition" : {
  "script" : {
    "source": "return true"
  }
}
----------------------------------------------------------------------
// NOTCONSOLE

[discrete]
[[stored-templates-scripts]]
===== Stored templates and scripts

If you <<modules-scripting-stored-scripts,store>>
your templates and scripts, you can reference them by id.

To reference a stored script or template, you use the formal object definition
and specify its id in the `id` field. For example, the following snippet
references the `email_notification_subject` template:

[source,js]
----------------------------------------------------------------------
{
  ...
  "actions" : {
    "email_notification" : {
      "email" : {
        "subject" : {
          "id" : "email_notification_subject",
          "params" : {
            "color" : "red"
          }
        }
      }
    }
  }
}
----------------------------------------------------------------------
// NOTCONSOLE
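
For that reference to resolve, a template with the same id has to be stored
first. The following request is a sketch of how the
`email_notification_subject` template might be stored. The template text is
illustrative, and it assumes that the `color` parameter passed by the watch is
exposed to the template as a top-level Mustache variable:

[source,console]
----------------------------------------------------------------------
PUT _scripts/email_notification_subject
{
  "script" : {
    "lang" : "mustache",
    "source" : "{{color}} alert"
  }
}
----------------------------------------------------------------------
// TEST[skip:illustrative example]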