Streaming approaches to building software systems are more popular than ever. Event sourcing is an approach to storing data in which the whole chain of transformations is stored, not just the final result of those transformations.
CQRS stands for Command Query Responsibility Segregation. It is a concept that is closely related to event sourcing, which is why we will explore both in this article.
What is Event Sourcing in a Database
Event sourcing patterns store the state of an object in the database as a sequence of events: the events that have modified the state of the object from the very beginning of its existence. Several principles form the basis of an event sourcing pattern:
- The events are immutable.
- There can be as many events for the given entity as needed. In other words, there is no limit on the number of events per object.
- Every event should have a name that represents the event’s meaning. For example, NewUserCreationEvent.
- In order to use the entity in the application (for example, show the name of the user in UI), we need to create a flat representation of the entity. This means that each time we want to use the entity, we should recalculate its current state using the sequence of state-changing events.
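The principles above can be illustrated with a minimal sketch. It assumes a user entity with two event types; `UserRenamedEvent`, the `User` class, and the `apply` and `current_state` helpers are hypothetical names chosen for illustration, not part of any particular library.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Optional


@dataclass(frozen=True)  # frozen=True makes each event immutable
class NewUserCreationEvent:
    user_id: int
    name: str


@dataclass(frozen=True)
class UserRenamedEvent:  # hypothetical event, for illustration only
    user_id: int
    new_name: str


@dataclass(frozen=True)
class User:
    """Flat representation of the entity, derived from its events."""
    user_id: int
    name: str


def apply(state: Optional[User], event) -> Optional[User]:
    """Return the state that results from applying one event."""
    if isinstance(event, NewUserCreationEvent):
        return User(event.user_id, event.name)
    if isinstance(event, UserRenamedEvent) and state is not None:
        return User(state.user_id, event.new_name)
    return state  # unknown events leave the state unchanged


def current_state(events) -> Optional[User]:
    """Recalculate the current state by replaying the whole sequence."""
    return reduce(apply, events, None)


events = [
    NewUserCreationEvent(user_id=1, name="Alice"),
    UserRenamedEvent(user_id=1, new_name="Alice Smith"),
]
user = current_state(events)  # User(user_id=1, name='Alice Smith')
```

The stored data is only the list of events; the `User` object is recomputed whenever it is needed.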
Very often, event sourcing is used together with CQRS. We will talk more about it later.
Benefits of Event Sourcing Architecture
Let’s briefly explore the advantages of building software using an event sourcing approach.
The most obvious gain is that the data can be restored as it was at a certain point in time: the system can replay the chain of events that transformed the data up to that moment. This is useful when you want to test many hypotheses, ideas, or approaches to data processing. You can build one pipeline of events, test it, and take the necessary measurements, then create another pipeline and let the system recompute the current state of the data from the new sequence of events. So, event sourcing makes the research and development process easier to some extent.
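As a rough illustration of point-in-time restoration, the sketch below replays only the events recorded up to a chosen moment. The `BalanceChanged` event and the `balance_as_of` function are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BalanceChanged:  # hypothetical event type
    occurred_at: datetime
    delta: int


def balance_as_of(events, moment: datetime) -> int:
    """Replay only the events that happened at or before `moment`."""
    balance = 0
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.occurred_at > moment:
            break
        balance += event.delta
    return balance


log = [
    BalanceChanged(datetime(2024, 1, 1), +100),
    BalanceChanged(datetime(2024, 1, 5), -30),
    BalanceChanged(datetime(2024, 1, 9), +50),
]

mid_january = balance_as_of(log, datetime(2024, 1, 6))  # 70
latest = balance_as_of(log, datetime(2024, 12, 31))     # 120
```

Swapping in a different replay function over the same `log` is one way to test an alternative processing idea without touching the stored events.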
The next benefit is that all events are stored. After a failure or data corruption, the current state of the data can be recovered by replaying the entire sequence of events for the corrupted entity. This means greater fault tolerance.
Event sourcing provides more extensive data to analysts and data scientists. Useful insights can be derived not only from the final state of data but also from the history of its transformations.
But there can also be some challenges when using event sourcing. Since we need to recalculate the state of the entity each time, more processing power is needed. Also, storing information about all transformations may require more storage space than storing only the last state of the entities. Both issues can be mitigated, however: cheap computing power and storage are now readily available from cloud providers.
What is CQRS Pattern
CQRS promotes the separation of commands and queries. This concept has a serious influence on the application’s architecture: the application now works independently with a “read” database and a “write” database, so there are two databases instead of the single one used in the traditional CRUD approach. The idea behind CQRS is that the whole application works better when responsibility is separated between different parts of the code and different elements of the system. For example, think about a blog application. There are significantly more queries that read data from the database than queries that write data into it. This is why we should primarily aim to optimize the “read” part of the application. CQRS can help to do this.
How does CQRS work together with event sourcing? The part of the application that updates data writes events to a sequence of events. One example of such a sequence is a Kafka topic. So, the “write” part of the application just adds new events to the queue. Another part of the application (called an event handler) is subscribed to the Kafka topic. It reads the events, transforms data in accordance with them, and writes the final state of the data into the “read” database. The part of the application that is responsible for accessing data (the “read” part) works directly with the “read” database. It just fetches the current state of entities, without concern for how this state was computed. The main task of this part of the application is to make read queries fast.
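The flow described above can be sketched in memory. This is a simplified illustration, not a Kafka client: the `EventLog` class stands in for a Kafka topic, a plain dictionary stands in for the “read” database, and all class and event names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostPublished:  # hypothetical events for a blog application
    post_id: int
    title: str


@dataclass(frozen=True)
class PostRetitled:
    post_id: int
    title: str


class EventLog:
    """Append-only sequence of events; stands in for a Kafka topic."""
    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        return self._events[offset:]


class WriteSide:
    """Handles commands by appending events; never touches the read database."""
    def __init__(self, log):
        self._log = log

    def publish_post(self, post_id, title):
        self._log.append(PostPublished(post_id, title))

    def retitle_post(self, post_id, title):
        self._log.append(PostRetitled(post_id, title))


class EventHandler:
    """Consumes new events and writes the resulting state to the read database."""
    def __init__(self, log, read_db):
        self._log, self._read_db, self._offset = log, read_db, 0

    def poll(self):
        for event in self._log.read_from(self._offset):
            self._read_db[event.post_id] = {"title": event.title}
            self._offset += 1  # remember how far we have read


class ReadSide:
    """Serves queries straight from the read database."""
    def __init__(self, read_db):
        self._read_db = read_db

    def get_title(self, post_id):
        return self._read_db[post_id]["title"]


log, read_db = EventLog(), {}
writer, handler, reader = WriteSide(log), EventHandler(log, read_db), ReadSide(read_db)

writer.publish_post(1, "Hello")
writer.retitle_post(1, "Hello, CQRS")
handler.poll()
title = reader.get_title(1)  # "Hello, CQRS"
```

Note that the read side only sees a change after the handler has processed the corresponding event, so in a real deployment the “read” database may briefly lag behind the event log.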
CQRS Architecture Explained
Let’s look at the schemas that demonstrate how CQRS architecture can be implemented in an app. In the first diagram, you can see an example of CQRS architecture without event sourcing.
There are components like user interface (UI), “read” and “write” databases, and “read” and “write” parts of the app. UI issues commands that should update data. These commands are processed by the “write” part of the system. It saves data in the “write” database. Simultaneously, this part of the app uses data from the “write” database to calculate the state of the data and write it in the “read” database. UI can then interact with the “read” database to fetch needed data.
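A minimal sketch of this variant (CQRS without event sourcing) might look as follows. The two dictionaries stand in for the “write” and “read” databases, and the `WritePart` and `ReadPart` classes, along with the comment-counting example, are assumptions made for illustration.

```python
class WritePart:
    """Processes commands: saves data, then refreshes the read database."""
    def __init__(self, write_db, read_db):
        self.write_db = write_db
        self.read_db = read_db

    def add_comment(self, post_id, text):
        # 1. Save the change in the "write" database.
        self.write_db.setdefault(post_id, []).append(text)
        # 2. Recalculate the state the UI needs and store it in the "read" database.
        comments = self.write_db[post_id]
        self.read_db[post_id] = {
            "comment_count": len(comments),
            "latest": comments[-1],
        }


class ReadPart:
    """Answers queries using only the precomputed read database."""
    def __init__(self, read_db):
        self.read_db = read_db

    def post_summary(self, post_id):
        return self.read_db.get(post_id, {"comment_count": 0, "latest": None})


write_db, read_db = {}, {}
write_part, read_part = WritePart(write_db, read_db), ReadPart(read_db)

write_part.add_comment(7, "Nice article")
write_part.add_comment(7, "Thanks for sharing")
summary = read_part.post_summary(7)
# {'comment_count': 2, 'latest': 'Thanks for sharing'}
```

Here the “write” database holds the current records rather than a history of events, so past states cannot be reconstructed from it.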
But, as we have said before, CQRS architecture is very often used together with event sourcing patterns. So, let’s explore how the schema changes when both CQRS and event sourcing are used.
As you can see, the “read” part remains unchanged. The “write” database is now represented by the queue of events (event store). The “write” part of the application publishes events (commands) to the queue (it can be a Kafka topic, for example). The event handler is the component that consumes events from the event store and uses them to update data in the “read” database. So, the current states of the entities are stored only in the “read” database. The full history of an entity’s transformations can be extracted from the sequence of events stored in the “write” database. CQRS architecture is implemented by separating responsibilities between commands and queries. The event sourcing principle is implemented by using the sequence of events to track changes to the data.
Best Use Cases for CQRS Implementation
Let’s briefly talk about where the CQRS approach (with event sourcing) can be helpful. In general, the decision to use CQRS should be carefully discussed within the team, and a thorough analysis of the pros and cons is required. But here are some recommendations:
- The complex business logic of the system makes the application of CQRS reasonable. CQRS keeps logic of data changes apart from reading data. This means different components of the system can evolve in their own way, be optimized as needed, and scaled according to specific needs. This approach resembles microservices architecture. CQRS makes the system more flexible, ready for scaling and changes, and easier to maintain.
- Another use case is when you know beforehand that scalability is very important for your system. One of the key benefits of CQRS is easier scalability. Less effort is needed when you can work on isolated components of the system one by one without worrying about the others.
- A large, distributed team of developers is one more argument for using CQRS. It is more convenient to split the work between teams when the elements of the system are loosely coupled. The most experienced developers can organize the work of the system as a whole, while less experienced developers work on specific components.
- If you want to test different logic of data processing, event sourcing with CQRS may be useful. It is especially true if the system has several components even without CQRS (for example, monitoring tool, searching app, analytics app, etc.). All these tools are consumers of data (or events), so event sourcing may be a good choice.
- You have a working system built with the traditional CRUD paradigm, but it has serious performance issues that are hard to solve without changing the architecture. Splitting the application into “read” and “write” parts can help to improve performance. It allows you to focus on the bottlenecks and the most critical points of the system.
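The point about several consumers of the same events can be sketched as follows. The event shapes and the two consumer functions (an analytics view and a search index) are hypothetical and serve only to show that independent tools can each derive their own view from one event sequence.

```python
from collections import Counter, defaultdict

# One shared sequence of events (hypothetical shapes, for illustration only).
events = [
    {"type": "ArticlePublished", "article_id": 1, "title": "Intro to CQRS"},
    {"type": "ArticleViewed", "article_id": 1},
    {"type": "ArticlePublished", "article_id": 2, "title": "Event sourcing basics"},
    {"type": "ArticleViewed", "article_id": 1},
    {"type": "ArticleViewed", "article_id": 2},
]


def build_view_counts(events):
    """Analytics consumer: number of views per article."""
    return Counter(e["article_id"] for e in events if e["type"] == "ArticleViewed")


def build_search_index(events):
    """Search consumer: lowercase title word -> set of article ids."""
    index = defaultdict(set)
    for e in events:
        if e["type"] == "ArticlePublished":
            for word in e["title"].lower().split():
                index[word].add(e["article_id"])
    return index


view_counts = build_view_counts(events)    # article 1 -> 2 views, article 2 -> 1 view
search_index = build_search_index(events)  # e.g. "cqrs" -> {1}
```

Each consumer can be changed or rebuilt from the same events without affecting the other.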
In this article, we explored the concepts of CQRS (Command Query Responsibility Segregation) and event sourcing. We described what they are, touched on their benefits, and explored use cases. They are especially helpful when used together. But the need for them should be carefully researched before they are adopted: an improperly designed architecture can lead to serious performance issues.