|
@@ -10,20 +10,70 @@
|
|
|
|
|
|
### ActionListener
|
|
|
|
|
|
-`ActionListener`s are a means off injecting logic into lower layers of the code. They encapsulate a block of code that takes a response
|
|
|
-value -- the `onResponse()` method --, and then that block of code (the `ActionListener`) is passed into a function that will eventually
|
|
|
-execute the code (call `onResponse()`) when a response value is available. `ActionListener`s are used to pass code down to act on a result,
|
|
|
-rather than lower layers returning a result back up to be acted upon by the caller. One of three things can happen to a listener: it can be
|
|
|
-executed in the same thread — e.g. `ActionListener.run()` --; it can be passed off to another thread to be executed; or it can be added to
|
|
|
-a list someplace, to eventually be executed by some service. `ActionListener`s also define `onFailure()` logic, in case an error is
|
|
|
-encountered before a result can be formed.
|
|
|
+Callbacks are used extensively throughout Elasticsearch because they enable us to write asynchronous and nonblocking code, i.e. code which
|
|
|
+doesn't necessarily compute a result straight away but also doesn't block the calling thread waiting for the result to become available.
|
|
|
+They support several useful control flows:
|
|
|
+
|
|
|
+- They can be completed immediately on the calling thread.
|
|
|
+- They can be completed concurrently on a different thread.
|
|
|
+- They can be stored in a data structure and completed later on when the system reaches a particular state.
|
|
|
+- Most commonly, they can be passed on to other methods that themselves require a callback.
|
|
|
+- They can be wrapped in another callback which modifies the behaviour of the original callback, perhaps adding some extra code to run
|
|
|
+ before or after completion, before passing them on.
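The control flows listed above can be sketched with a minimal stand-in interface. Everything named here (`Listener`, `ControlFlows` and its methods) is hypothetical and exists only for illustration; the real `ActionListener` API is richer and differs in detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical stand-in for ActionListener, reduced to its two callbacks.
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

class ControlFlows {
    // Callbacks parked here until the system reaches some particular state.
    private final List<Listener<String>> waiting = new ArrayList<>();

    // Complete immediately on the calling thread.
    static void completeNow(Listener<String> listener) {
        listener.onResponse("done");
    }

    // Complete on whichever thread the executor chooses.
    static void completeElsewhere(Executor executor, Listener<String> listener) {
        executor.execute(() -> listener.onResponse("done"));
    }

    // Store the callback; it is completed later by markReady().
    synchronized void completeWhenReady(Listener<String> listener) {
        waiting.add(listener);
    }

    void markReady(String state) {
        final List<Listener<String>> toComplete;
        synchronized (this) {
            toComplete = new ArrayList<>(waiting);
            waiting.clear();
        }
        // Complete outside the lock so callbacks cannot block other callers.
        for (Listener<String> listener : toComplete) {
            listener.onResponse(state);
        }
    }

    // Pass the callback on to another method that itself requires a callback.
    static void delegate(Listener<String> listener) {
        completeNow(listener);
    }

    // Wrap a callback: transform the response before passing it on.
    static <T, R> Listener<T> map(Listener<R> delegate, Function<T, R> fn) {
        return new Listener<T>() {
            @Override
            public void onResponse(T response) {
                final R mapped;
                try {
                    mapped = fn.apply(response);
                } catch (Exception e) {
                    delegate.onFailure(e);
                    return;
                }
                delegate.onResponse(mapped);
            }

            @Override
            public void onFailure(Exception e) {
                delegate.onFailure(e);
            }
        };
    }
}
```

In this sketch the wrapper calls `delegate.onResponse()` outside the `try` block, so an exception thrown by the delegate itself is not fed back into the same delegate's `onFailure()`.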
|
|
|
+
|
|
|
+`ActionListener` is the general-purpose callback interface in the Elasticsearch codebase. It is
|
|
|
+used pretty much everywhere that needs to perform some asynchronous and nonblocking computation. The uniformity makes it easier to compose
|
|
|
+parts of the system together without needing to build adapters to convert back and forth between different kinds of callback. It also makes
|
|
|
+it easier to develop the skills needed to read and understand all the asynchronous code, although this definitely takes practice and is
|
|
|
+certainly not easy in an absolute sense. Finally, it has allowed us to build a rich library for working with `ActionListener` instances
|
|
|
+themselves, creating new instances out of existing ones and completing them in interesting ways. See for instance:
|
|
|
+
|
|
|
+- all the static methods on [`ActionListener`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/ActionListener.java) itself
|
|
|
+- [`ThreadedActionListener`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/support/ThreadedActionListener.java) for forking work elsewhere
|
|
|
+- [`RefCountingListener`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/support/RefCountingListener.java) for running work in parallel
|
|
|
+- [`SubscribableListener`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/support/SubscribableListener.java) for constructing flexible workflows
|
|
|
+
|
|
|
+Callback-based asynchronous code can easily call regular synchronous code, but synchronous code cannot run callback-based asynchronous code
|
|
|
+without blocking the calling thread until the callback is invoked. This blocking is at best undesirable (threads are too expensive to
|
|
|
+waste with unnecessary blocking) and at worst outright broken (the blocking can lead to deadlock). Unfortunately this means that most of our
|
|
|
+code ends up having to be written with callbacks, simply because it's ultimately calling into some other code that takes a callback. The
|
|
|
+entry points for all Elasticsearch APIs are callback-based (e.g. REST APIs all start at
|
|
|
+[`org.elasticsearch.rest.BaseRestHandler#prepareRequest`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/rest/BaseRestHandler.java#L158-L171),
|
|
|
+and transport APIs all start at
|
|
|
+[`org.elasticsearch.action.support.TransportAction#doExecute`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/support/TransportAction.java#L65))
|
|
|
+and the whole system fundamentally works in terms of an event loop (an `io.netty.channel.EventLoop`) which processes network events via
|
|
|
+callbacks.
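The cost of calling callback-based code from synchronous code can be illustrated with a small blocking bridge. This is a sketch only: `Listener` and `BlockingBridge.callAndBlock` are hypothetical names, and the bridge is built on the JDK's `CompletableFuture` rather than on any Elasticsearch class.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Consumer;

// Hypothetical stand-in for ActionListener, reduced to its two callbacks.
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

class BlockingBridge {
    // Runs a callback-based method and parks the calling thread until the
    // callback has been completed, by any thread.
    static <T> T callAndBlock(Consumer<Listener<T>> asyncMethod)
            throws InterruptedException, ExecutionException {
        final CompletableFuture<T> future = new CompletableFuture<>();
        asyncMethod.accept(new Listener<T>() {
            @Override
            public void onResponse(T response) {
                future.complete(response);
            }

            @Override
            public void onFailure(Exception e) {
                future.completeExceptionally(e);
            }
        });
        // The calling thread does no useful work while it waits here.
        return future.get();
    }
}
```

If the thread parked in `get()` is the same thread that would eventually have completed the listener, the call never returns, which is the deadlock case described above.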
|
|
|
+
|
|
|
+`ActionListener` is not an _ad-hoc_ invention. Formally speaking, it is our implementation of the general concept of a continuation in the
|
|
|
+sense of [_continuation-passing style_](https://en.wikipedia.org/wiki/Continuation-passing_style) (CPS): an extra argument to a function
|
|
|
+which defines how to continue the computation when the result is available. This is in contrast to _direct style_ which is the more usual
|
|
|
+style of calling methods that return values directly back to the caller so they can continue executing as normal. There are essentially two
|
|
|
+ways that computation can continue in Java (it can return a value or it can throw an exception) which is why `ActionListener` has both an
|
|
|
+`onResponse()` and an `onFailure()` method.
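As a rough illustration of the two styles, here is the same computation written both ways. The `Listener` interface and the `Styles` class are hypothetical stand-ins, not Elasticsearch code.

```java
// Hypothetical stand-in for ActionListener, reduced to its two callbacks.
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

class Styles {
    // Direct style: the result is returned, or an exception is thrown,
    // straight back to the caller.
    static int parseDirect(String text) {
        return Integer.parseInt(text);
    }

    // Continuation-passing style: void return type, nothing thrown. The
    // "return value" goes to onResponse() and the "thrown exception" goes
    // to onFailure().
    static void parseCps(String text, Listener<Integer> listener) {
        final int value;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            listener.onFailure(e);
            return;
        }
        listener.onResponse(value);
    }
}
```

The two outcomes of `parseDirect` map one-to-one onto the two methods of the listener passed to `parseCps`.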
|
|
|
+
|
|
|
+CPS is strictly more expressive than direct style: direct code can be mechanically translated into continuation-passing style, but CPS also
|
|
|
+enables all sorts of other useful control structures such as forking work onto separate threads, possibly to be executed in parallel,
|
|
|
+perhaps even across multiple nodes, or possibly collecting a list of continuations all waiting for the same condition to be satisfied before
|
|
|
+proceeding (e.g.
|
|
|
+[`SubscribableListener`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/support/SubscribableListener.java)
|
|
|
+amongst many others). Some languages have first-class support for continuations (e.g. the `async` and `await` primitives in C#) allowing the
|
|
|
+programmer to write code in direct style away from those exotic control structures, but Java does not. That's why we have to manipulate all
|
|
|
+the callbacks ourselves.
|
|
|
+
|
|
|
+Strictly speaking, CPS requires that a computation _only_ continues by calling the continuation. In Elasticsearch, this means that
|
|
|
+asynchronous methods must have a `void` return type and may not throw any exceptions. This is mostly the case in our code as written today,
|
|
|
+and is a good guiding principle, but we don't enforce void exceptionless methods and there are some deviations from this rule. In
|
|
|
+particular, it's not uncommon to permit some methods to throw an exception, using things like
|
|
|
+[`ActionListener#run`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/ActionListener.java#L381-L390)
|
|
|
+(or an equivalent `try ... catch ...` block) further up the stack to handle it. Some methods also take (and may complete) an
|
|
|
+`ActionListener` parameter, but still return a value separately for other local synchronous work.
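The idea of catching exceptions further up the stack and routing them to the listener can be sketched as follows. This is a simplified illustration of the pattern, not a copy of `ActionListener#run`; `Listener`, `ThrowingConsumer` and `Listeners.run` are hypothetical names.

```java
// Hypothetical stand-in for ActionListener, reduced to its two callbacks.
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

// Hypothetical functional interface: a consumer that may throw.
interface ThrowingConsumer<T> {
    void accept(T value) throws Exception;
}

class Listeners {
    // Runs the action with the listener, converting any exception the
    // action throws into a call to onFailure().
    static <T> void run(Listener<T> listener, ThrowingConsumer<Listener<T>> action) {
        try {
            action.accept(listener);
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }
}
```

With a wrapper like this, code inside the action may simply throw instead of calling `onFailure()` itself. In this sketch, an action that completes the listener and then throws would complete it twice, so the action should do one or the other.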
|
|
|
|
|
|
This pattern is often used in the transport action layer with the use of the
|
|
|
-[ChannelActionListener]([url](https://github.com/elastic/elasticsearch/blob/8.12/server/src/main/java/org/elasticsearch/action/support/ChannelActionListener.java))
|
|
|
+[`ChannelActionListener`](https://github.com/elastic/elasticsearch/blob/v8.12.2/server/src/main/java/org/elasticsearch/action/support/ChannelActionListener.java)
|
|
|
class, which wraps a `TransportChannel` produced by the transport layer. `TransportChannel` implementations can hold a reference to a Netty
|
|
|
-channel with which to pass the response back to the network caller. Netty has a many-to-one association of network callers to channels, so
|
|
|
-a call taking a long time generally won't hog resources: it's cheap. A transport action can take hours to respond and that's alright,
|
|
|
-barring caller timeouts.
|
|
|
+channel with which to pass the response back to the network caller. Netty has a many-to-one association of network callers to channels, so a
|
|
|
+call taking a long time generally won't hog resources: it's cheap. A transport action can take hours to respond and that's alright, barring
|
|
|
+caller timeouts.
|
|
|
|
|
|
(TODO: add useful starter references and explanations for a range of Listener classes. Reference the Netty section.)
|
|
|
|