Monday, December 28, 2015

Reactive Actors

I've been meaning to revisit reactive programming and the actor model for a while now. I first learned about them in the Principles of Reactive Programming Coursera class and then actors came up again in the Seven Concurrency Models in Seven Weeks book. The Scala I picked up is quickly being forgotten and I haven't done a post with code in a while, so here I'll get back into that and create a simple application using Akka and RxScala.

Actor model

The developerWorks article JVM Concurrency: Acting asynchronously with Akka gives a good introduction to the actor model:
The actor model for concurrent computations builds up systems based on primitives called actors. Actors take actions in response to inputs called messages. Actions can include changing the actor's own internal state as well as sending off other messages and even creating other actors. All messages are delivered asynchronously, thereby decoupling message senders from receivers. Because of this decoupling, actor systems are inherently concurrent: Any actors that have input messages available can be executed in parallel, without restriction.
Then JVM Concurrency: Building actor applications with Akka goes on to explain the advantages of this approach:
If you compose your actors and messages correctly, you end up with a system in which most things happen asynchronously. Asynchronous operation is harder to understand than a linear approach, but it pays off in scalability. Highly asynchronous programs are better able to use increased system resources (for example, memory and processors) either to accomplish a particular task more quickly or to handle more instances of the task in parallel. With Akka, you can even extend this scalability across multiple systems, by using remoting to work with distributed actors.
At first the actor model may sound the same as what I described in my Communicating Sequential Processes post because both involve message passing, but the two concurrency models have several differences:
  • Actors have identities while CSPs are anonymous
  • Actors transmit messages to named actors while CSPs transmit messages using channels
  • Actors transmit messages asynchronously while CSPs can't transmit a message until the sender is ready to receive it
My impression, and I could be wrong, is that actors more naturally extend beyond a single machine to a distributed system since the sending and receiving of messages is decoupled. A quick search does turn up distributed channels in pycsp, though, so it seems that both can be distributed.

Reactive applications

The Reactive Manifesto details four qualities of reactive applications:
  • responsive - the system responds in a timely manner if at all possible
  • resilient - the system stays responsive in the face of failure
  • elastic - the system stays responsive under varying workloads
  • message driven - the system relies on asynchronous message passing between components
Where Akka describes itself as a toolkit and runtime, RxScala only claims to be a library for composing asynchronous and event-based programs using observable sequences. To me it's not clear how it helps us achieve all four qualities (or if it even intends to).  Nevertheless, the ReactiveX introduction explains their advantages:
The ReactiveX Observable model allows you to treat streams of asynchronous events with the same sort of simple, composable operations that you use for collections of data items like arrays. It frees you from tangled webs of callbacks, and thereby makes your code more readable and less prone to bugs.
This means is that the methods returning Observables can be implemented using thread pools, non-blocking I/O, actors, or anything else. This is how ReactiveX and Akka will be used together: Actors are the concurrency implementation for services communicating with asynchronous messages.

Combining Akka and RxScala

I came up with the following short example. First I wrote a couple of methods returning an Observable to get a feel for it, then added the stockQuote() method which also uses an actor in it's implementation:


Running it produces the expected output, something like:

6
8
broken service
GOOG: 253.22

I can really see the potential in the Observable model, especially after reading more about it at The Netflix Tech Blog. If you were already using actors maybe combining them like this could make sense. I also need to checkout Akka Streams which seems like a similar idea.

UPDATE

Ray is a framework for parallelizing ML workloads. They use the actor model as a way of coordinating work and maintaining state.