
Actor Model and Event Sourcing. Optimistic locking is not the only way…

Andrzej Ludwikowski

When I try to explain to my kids, parents, or simply people without an IT background what my job is, I often use the analogy that developing software is like building with Lego bricks. The difference is that you have an unlimited supply of bricks and the power to create your own brick shapes, something every kid dreams of. From time to time, in the software development area, you can find two (or more) pieces that really fit well together. The combination creates a very powerful, but at the same time elegant and pretty, solution. This post is about such a pair: the Event Sourcing pattern and an implementation based on the Actor Model. Let's take a deep dive into this combination and examine how these elements go together.

Photo: https://unsplash.com/photos/TaEd6ndkRWM

Since you are reading this post, I assume that you are already pretty familiar with the concept of Event Sourcing. I don't want to repeat the many existing articles about it, just give a brief summary. From the software perspective, Event Sourcing is built on top of three main building blocks:

  • Commands — define what we want to happen in the system,
  • State — usually an aggregate from the DDD approach, responsible for keeping some part of the system consistent and valid (aggregate invariants),
  • Events — capture what has occurred in the system.

The state/aggregate usually needs to provide two entry point methods:

The first one is for processing the command. This is the place for business validation, checking invariants, etc. If everything is ok, then we can return a list of events that will be persisted. Because we also want to capture the situation when the command is not valid, the return type of this method is usually something like Either<Error, List<Event>>. Hiding domain errors by throwing exceptions is a bad practice and we should use Either-like types for such cases. What is important here is that this method does not mutate the state.

For state mutation, we use only the second method. Since rebuilding the state can be done many times (e.g. after an application restart), this method should be free of any side effects, otherwise the system could behave in an unpredictable way. Separating the mutation from the command-processing method is not only useful for replaying events. We should mutate the state only once we know that the events were successfully persisted. Only then can we apply the events and get the new version of the state, possibly as an immutable object. Of course, I'm not saying that you should implement everything in just two methods. Those are only ports (from the hexagonal architecture perspective) required by most Event Sourcing implementations.
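To make this concrete, here is a minimal sketch of such an aggregate in Java. All the names (UserAggregate, CreateUser, UserCreated, the Result type standing in for Either) are my own illustration, not code from any particular library:

import java.util.List;

// Illustrative command and event types (assumptions for the example).
record CreateUser(String id, String email) {}
record UserCreated(String id, String email) {}

// A minimal Either substitute: either an error message or a value.
record Result<T>(String error, T value) {
    static <T> Result<T> ok(T value) { return new Result<>(null, value); }
    static <T> Result<T> fail(String error) { return new Result<>(error, null); }
}

// The aggregate exposes exactly the two entry points described above.
record UserAggregate(String id, String email, boolean created) {

    // 1. Command processing: validate, never mutate, return events (or an error).
    Result<List<UserCreated>> process(CreateUser cmd) {
        if (created) {
            return Result.fail("User already exists"); // invariant violated
        }
        return Result.ok(List.of(new UserCreated(cmd.id(), cmd.email())));
    }

    // 2. Event application: a pure, side-effect-free state transition that
    //    returns a new immutable instance (safe to replay any number of times).
    UserAggregate apply(UserCreated event) {
        return new UserAggregate(event.id(), event.email(), true);
    }
}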

The basic flow of Event Sourcing is captured on the diagram below:

Event Sourcing basic flow

The actual implementation of the Event Sourcing pattern will depend on the technology stack, but there will be some similarities. First, you need an implementation of the Event Store. The most basic implementation will require only two methods:

interface EventStore {
    // append new events to the journal
    void saveEvents(List<Event> events);
    // replay all events for a given aggregate to rebuild its state
    List<Event> getEventsFor(StateId id);
}

One method for saving events, and one for replaying events to rebuild the state. Our application is used by many users at the same time, so to avoid concurrent DB updates, the saveEvents method will most likely do some optimistic locking (for the same aggregate) on the underlying database. Such an approach requires an additional table (or column) holding the version number to lock on. Everything works just fine in the beginning, but…
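Before we get to the problems, here is roughly what such a version-guarded save could look like. This is a hypothetical sketch using Spring's JdbcTemplate; the table and column names are made up for illustration:

import java.util.ConcurrentModificationException;
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

// Hypothetical, simplified event store illustrating optimistic locking.
class JdbcEventStore {
    private final JdbcTemplate jdbc;

    JdbcEventStore(JdbcTemplate jdbc) { this.jdbc = jdbc; }

    void saveEvents(String aggregateId, long expectedVersion, List<String> serializedEvents) {
        // Bump the version only if nobody else has done so in the meantime.
        int updated = jdbc.update(
            "UPDATE aggregate_version SET version = version + 1 WHERE id = ? AND version = ?",
            aggregateId, expectedVersion);
        if (updated == 0) {
            // Another writer won the race; the caller must reload and retry.
            throw new ConcurrentModificationException("Stale version for " + aggregateId);
        }
        // Append the events only after the version check succeeded.
        for (String event : serializedEvents) {
            jdbc.update("INSERT INTO events(aggregate_id, payload) VALUES (?, ?)",
                aggregateId, event);
        }
    }
}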

This kind of implementation has some major problems.

The first one is, of course, optimistic locking based on the database. We are so used to it. The problem of concurrent updates is not handled by our code. We don't like to deal with such problems. They are too complex. It's better to move them somewhere else: the database layer. Been there, done that.

With time, when your application needs to handle more and more load, you will pay a price for such an approach. Optimistic locking, by definition, will work acceptably only if the chances of concurrent modifications are very low. Aggregates like User, ShoppingCart, or similar are not updated by many requests at the same time. On the other hand, OrderBook, OnlineAuction, or StoreWarehouse are continuously updated from many different sources.

Optimistic locking for such domains will fail even with a moderate load. To mitigate this problem, the only thing you can do is put a message queue between the caller and your aggregate, which is a very good idea, but not always possible to implement. Most likely, it also requires shifting from synchronous communication to asynchronous, and from my experience, this is very often a blocker.

If the message queue is not an option, then we put more focus on the database itself. Can it scale? Of course 🙂 Do we have some budget limits? Of course 🙁

Even with an unlimited budget, vertical scaling of a single-host database cannot continue forever. How about migrating to some distributed, easy-to-scale, and cheaper database? Now things are getting a lot more tricky. The first surprise you might notice is that support for optimistic locking, so basically the ability to perform a query like:

update … version = x+1 if version = x;

is not so common. Yes, you can find it in all SQL databases, but once you leave this area, you are forced to look at leader-replica architectures. Unfortunately, this will not solve your performance problems, since all writes must go through the leader.

Leaderless databases like Cassandra or DynamoDB will not give you good (if any) support for optimistic locking. Don't be fooled by nice-sounding features like lightweight transactions. This kind of locking is not recommended by default. It's very heavy for the database cluster and also very limited when it comes to actual use cases. It's hard to expect optimistic locking from a distributed database where you barely have support for transactions.

As you can see, this “optimistic” shortcut has a lot of implications in the long run. So what can we do about this? Let's handle concurrency where it should be handled: in the application layer. OMG, does it mean that I will need to write some awful synchronized code or some crazy concurrent abstraction that I barely understand? No, it's time for the Actor Model.

Here you can read about the theory behind the Actor Model, but I suppose you want to know how to use it and why it’s worth spending some time to learn it.

That's why we will start with the actor concept itself. In a nutshell, an actor is a very clever abstraction that helps you easily build very complex concurrent systems. You can look at it as a guardian of your state. You don't have a reference to the state. You have a reference to the actor, and the only thing you can do is send a message to it.

This message will first go to the actor's mailbox, and when the actor is ready, the message will be consumed. Consuming a message (see the sketch after this list) can lead to:

  • state mutation,
  • and/or reply to the caller (also an actor) with another message,
  • and/or changing the actor behavior.
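A minimal sketch of such an actor, using the Akka Typed Java API. The Counter example and its messages are my own illustration, not code from this post:

import akka.actor.typed.ActorRef;
import akka.actor.typed.Behavior;
import akka.actor.typed.javadsl.Behaviors;

public class Counter {
    // Messages the actor understands.
    public interface Command {}
    public record Increment() implements Command {}
    public record GetValue(ActorRef<Integer> replyTo) implements Command {}

    public static Behavior<Command> create(int value) {
        return Behaviors.receive(Command.class)
            // State "mutation": return a new behavior closing over the new value.
            .onMessage(Increment.class, msg -> create(value + 1))
            // Reply to the caller with another message.
            .onMessage(GetValue.class, msg -> {
                msg.replyTo().tell(value);
                return Behaviors.same();
            })
            .build();
    }
}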

Actors live in an Actor System, a hierarchical structure similar to a tree.

Actors hierarchy

There will be only one actor with a given name/id in the whole actor system (at least in the Akka implementation). It doesn't matter from which part of the system (API controller, external message queue consumer, etc.) you call the actor. It will be the same actor. Someone could call it a stateful singleton. Although that's true, I don't like this metaphor, because singleton per se is a concept with very bad connotations. State guardian/protector sounds much better.

In other words, you don't have to worry about concurrent mutation of the state, because an actor consumes one message at a time. Basically, you get non-blocking, thread-safe, sequential mutations in a concurrent world.

Actor protecting the aggregate

How does this concept fit the Event Sourcing implementation? In many ways. Messages sent to the actor are basically the Commands from ES. The actor is protecting the state, and if you use its full capabilities, like stashing messages, you can also achieve the Single Writer Principle per aggregate/state.

The contention, which is required when you need strong consistency, is now limited to the only scope where it is necessary (keeping the aggregate invariants), in contrast to optimistic locking, where you need to apply it on a table/row level. Not to mention that you can talk to your database in a non-blocking way. The next command (thanks to stashing) will be processed once the actor receives a response from the database that the events were persisted.

Sounds interesting? There is more. Since you can handle data consistency on the application level, you are not so restricted in your DB choice. You can actually use any DB with the ability to save all events atomically. Full ACID is no longer required. Finally, you can use truly distributed databases like Cassandra, DynamoDB, Couchbase, or Spanner for something other than read models.

That's not all. One of the biggest problems in Event Sourcing is reloading the state from events. Sometimes it can take a while. Of course, you can use snapshots to speed up the process, but it will still require calling the database. With actors, once you load the state, you don't have to do it a second time, since all the data is already in memory, waiting to handle the next command. This is in contrast to the optimistic locking approach, where you need to load the state each time. Yep, you just got a write-through cache completely for free. Once you receive a reply from the actor, you have a guarantee that the events were persisted and that you have the most current state in memory. Imagine how your database will love you if most of the operations generated by your application are append-only writes (saving events). Any database can handle such a load extremely efficiently.

Ok, enough with the theory, show me the code. I have another surprise for you. You don't have to write all this (pretty complex) code yourself. All you need to do is use the Akka Persistence Typed API:
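A persistent entity built on this API looks roughly like the sketch below. EventSourcedBehavior, CommandHandler, and EventHandler are the actual Akka Persistence Typed (Java) classes; all the User* domain types are my own illustration, assumed for the example:

import akka.actor.typed.ActorRef;
import akka.persistence.typed.PersistenceId;
import akka.persistence.typed.javadsl.CommandHandler;
import akka.persistence.typed.javadsl.EventHandler;
import akka.persistence.typed.javadsl.EventSourcedBehavior;

// Illustrative domain types (assumptions, not from the post):
interface UserCommand {}
record CreateUser(String userId, ActorRef<Confirmed> replyTo) implements UserCommand {}
interface UserEvent {}
record UserCreated(String userId) implements UserEvent {}
record Confirmed() {}
record User(String id) {
    static User empty() { return new User(null); }
    User apply(UserCreated e) { return new User(e.userId()); }
}

public class UserEntity extends EventSourcedBehavior<UserCommand, UserEvent, User> {

    public UserEntity(PersistenceId persistenceId) {
        super(persistenceId);
    }

    @Override
    public User emptyState() {
        return User.empty();
    }

    // The command-processing entry point: validate and decide which events to persist.
    @Override
    public CommandHandler<UserCommand, UserEvent, User> commandHandler() {
        return newCommandHandlerBuilder()
            .forAnyState()
            .onCommand(CreateUser.class,
                (state, cmd) -> Effect().persist(new UserCreated(cmd.userId()))
                                        .thenReply(cmd.replyTo(), s -> new Confirmed()))
            .build();
    }

    // The state-mutation entry point: a pure function from (state, event) to new state.
    @Override
    public EventHandler<User, UserEvent> eventHandler() {
        return newEventHandlerBuilder()
            .forAnyState()
            .onEvent(UserCreated.class, (state, event) -> state.apply(event))
            .build();
    }
}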

But I don't see any actors here. Yes, with the new typed API you just need to provide the actor behavior. The classic approach is still available, but my recommendation is to work with types. To expose this as a nice service, you will need to wrap the domain command with an Envelope and change the Entity implementation a bit, but you should end up with something similar to:
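Again, a hedged sketch of what such a service could look like when the entity runs under Akka Cluster Sharding; the UserService name, the entity key, and the timeout are assumptions for the example:

import java.time.Duration;
import java.util.concurrent.CompletionStage;
import akka.cluster.sharding.typed.javadsl.ClusterSharding;
import akka.cluster.sharding.typed.javadsl.EntityRef;
import akka.cluster.sharding.typed.javadsl.EntityTypeKey;

public class UserService {

    // Assumed type key under which UserEntity was registered with sharding.
    public static final EntityTypeKey<UserCommand> ENTITY_KEY =
        EntityTypeKey.create(UserCommand.class, "User");

    private static final Duration TIMEOUT = Duration.ofSeconds(3);

    private final ClusterSharding sharding;

    public UserService(ClusterSharding sharding) {
        this.sharding = sharding;
    }

    // Non-blocking: returns immediately with a CompletionStage;
    // the actual work happens inside the (sharded) actor.
    public CompletionStage<Confirmed> createUser(String userId) {
        EntityRef<UserCommand> user = sharding.entityRefFor(ENTITY_KEY, userId);
        return user.ask(replyTo -> new CreateUser(userId, replyTo), TIMEOUT);
    }
}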

Remember, this is Java code. In Scala, everything looks even nicer, and it's definitely more concise 😉 The UserService methods are non-blocking, and you can easily combine them with your favorite reactive framework responsible for the API layer: Spring WebFlux, Micronaut, Quarkus, etc. It's a common mistake to think that you need to use the Akka stack for every part of your application. Trust me, it will be very refreshing to mix those stacks and create an application in a new way. An application where persistence is not managed by yet another DAO implementation. Where your domain is free from any frameworks/libraries (including Akka).

When it comes to the actual database integration, all you have to do is choose and configure the right plugin in application.conf:

akka.persistence.journal.plugin = "jdbc-journal"

You can use the officially supported plugins, which I recommend, or the ones created by the community; the full list is here. This could be the first time you actually change your underlying DB without changing the application code. You can start with a good old SQL database and, when you hit its scaling limits, migrate to something distributed. Besides choosing the right plugin, remember to spend some time analyzing the serialization options for events/state.
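For example, a fuller configuration for the akka-persistence-jdbc plugin might look like the snippet below. The snapshot-store key and the marker-interface serialization binding follow common conventions from the Akka documentation; the com.example marker interface is an assumption about your codebase:

akka.persistence.journal.plugin = "jdbc-journal"
akka.persistence.snapshot-store.plugin = "jdbc-snapshot-store"

akka.actor {
  serialization-bindings {
    # serialize all events/state marked with this (assumed) marker interface
    "com.example.CborSerializable" = jackson-cbor
  }
}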

Of course, I made some shortcuts while explaining this powerful combination of Event Sourcing and actors. If you are still not convinced, I hope that, at least, you are interested in finding out more about this topic. The other takeaways from this post are as follows.

Optimistic locking is still locking

Remember, there is life beyond optimistic locking. I've noticed that many domain decisions, like aggregate size, are driven by the fact that you will need to handle optimistic locking. I've heard it many times: aggregates should be small, otherwise the optimistic locking transaction will be too big and too slow. Not all databases support optimistic locking. Fortunately, there are other ways to achieve consistency. With actors (and Event Sourcing), your aggregate can be modeled without this constraint. Your code is simpler and describes the real world better.

Distributed Actors

If you like the idea of the Single Writer Principle via an actor, keep in mind that you can extend it further and get a Distributed Single Writer Principle with Akka Sharding and Akka Cluster; you can read more about this in one of my previous blog posts. Trust me, scaling the write path of an Event Sourcing implementation is a very complex area. As I mentioned earlier, you can choose a different strategy and put a (scalable) message queue between the caller and the aggregate, but this option may introduce hard-to-handle edge cases. Obviously, with Akka actors you won't be free from problems and errors either, but at the same time, you will get much more. Check out topics like Distributed Data or multi-DC Replicated Event Sourcing: very powerful functionalities.

Want more?

Would you like to try it, but you are not so comfortable with the new stack? Or you don't have the time to learn it and make all the possible mistakes at the beginning? Contact SoftwareMill Academy and we will create a dedicated workshop for you. If you would like to see more code, I'm planning to release a series of videos where we will go step by step through all the nuances of a Reactive Event Sourcing implementation. Join the mailing list to make sure you won't miss any of it.
