Clean up Axon Server Event Store

Simon Zambrovski
Holisticon Consultants
5 min read · Jan 31, 2024

One of the main concepts of Event Sourcing is that events, being facts in history, serve as the only source of truth. Any other information handled by the system is derived from the stream of events stored in a special place called the Event Store. The Event Store is therefore a specialized storage that allows appending events and loading them by different criteria, which enables replaying streams for projections and loading event-sourced aggregates.

To guarantee consistency in the system, events that have been stored should never be touched again; engineers speak of an append-only Event Store. The append-only Event Store is a very nice concept that allows building highly scalable systems, but it still introduces some issues in operations and production. Here are two obvious ones:

  • Storing relevant events forever will eventually fill any storage system – we eventually run out of space because of important data.
  • Data may become obsolete – having no ability to deal with this creates a lot of waste in your event store – we again eventually run out of space because of waste.

The problem

I want to skip the first issue and address the second one. So the question is: what needs to be done to clean up the event store?

Well, in general (following the advice of Greg Young in „Versioning in Event-Sourced Systems“):

  • Connect to Event Store
  • Filter events (copy relevant, drop obsolete)
  • Write new Event Store
  • Switch your system to the new Event Store
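
The four steps above can be sketched in a few lines of Kotlin. This is only an illustration of the copy-and-filter idea, assuming a hypothetical in-memory `StoredEvent` type and plain lists standing in for the old and the new event store; it does not use any real event store API.

```kotlin
// Hypothetical event representation, for illustration only.
data class StoredEvent(val sequence: Long, val type: String, val payload: String)

// Copy-and-filter: read every event of the old store, drop the obsolete ones
// and return the remaining events as the content of the new store.
fun copyRelevant(
    oldStore: List<StoredEvent>,
    isObsolete: (StoredEvent) -> Boolean
): List<StoredEvent> = oldStore.filterNot(isObsolete)

fun main() {
    val oldStore = listOf(
        StoredEvent(0, "SensorCreated", "temp1"),
        StoredEvent(1, "ForecastUpdated", "forecast A"),
        StoredEvent(2, "ForecastUpdated", "forecast B"),
    )
    // Assumption: every forecast update except the latest one is obsolete.
    val latestForecast = oldStore.last { it.type == "ForecastUpdated" }
    val newStore = copyRelevant(oldStore) {
        it.type == "ForecastUpdated" && it != latestForecast
    }
    // The application would now be switched over to newStore.
    newStore.forEach { println(it) }
}
```

In a real system the two lists would be two separate event stores, and the switch-over would be a deployment step rather than a variable assignment.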

In practice, AxonIQ introduced the Event Transformation API starting with Axon Framework 4.9 and Axon Server 2022.x. Since there is a nice article by Mitchell explaining the main concepts, I'll only briefly reference the relevant parts.

The use case

Imagine an artificial application that is responsible for storing temperature forecasts, which it periodically receives from some external forecasting service. Since the users are only interested in the current forecast, the old one is considered obsolete as soon as a new forecast is delivered. A forecast always covers one year, and as you can imagine, if a new forecast is produced every day, for example, the event store fills up with a lot of waste. This data may be interesting for the forecasting service in terms of quality control, but it is entirely irrelevant for the users of the system.

The idea

Let us use the Axon Framework Event Transformation API to clean up the event store.

The experiment

I built a trivial Spring Boot 3 application, using Axon Framework 4.9 via Axon Spring Boot Starter and Axon Server 2023.2.2 as event store. Want to see the code first? Check it out on GitHub: https://github.com/holixon/axon-event-transformation-example. Using an artificial simulation REST endpoint I can create a lot of forecast data resulting in a lot of events. Events are stored in JSON and look like this:

{
  "sensorId": "temp1",
  "values": {
    "2025-01-16": {
      "value": 23.363865,
      "unit": "°C"
    },
    "2025-01-17": {
      "value": 23.188648,
      "unit": "°C"
    },
    "...",
    "another 363 values for dates of the next year"
  }
}

Such an event payload consumes 16415 bytes. To make things more interesting, I configured Axon Server to create small segment files by setting the segment size to 1000000 bytes (977 KB) instead of 256 MB. After playing around a bit, I found that 58 events fit into one segment, which works out to 16845 bytes per event (including metadata). For this experiment, I only consider the disk space used by the event segment files (ignoring snapshot files and index files).

A fresh empty event store consisting of one segment looks like this:

977K Jan 17 22:40 00000000000000000000.events

If I simulate the updates of a whole year, producing 365 forecasts, the batch of 365 events consumes 16845 bytes * 365 = 6148425 bytes. That is a little more than six full segments, so each batch run occupies seven segment files, the last of which is only partially filled.
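
The arithmetic can be double-checked with a small Kotlin sketch. The constants are the figures from the text; the function names are made up for this illustration, and the segment count is only a rough estimate based on ceiling division (it ignores any per-segment overhead).

```kotlin
// Figures taken from the experiment described above.
const val BYTES_PER_EVENT = 16_845L        // one forecast event including metadata
const val SEGMENT_SIZE_BYTES = 1_000_000L  // configured segment size
const val EVENTS_PER_RUN = 365L            // one forecast per day for a year

// Total number of bytes written by one simulation run.
fun bytesPerRun(): Long = BYTES_PER_EVENT * EVENTS_PER_RUN

// Rough estimate of the number of segment files needed (ceiling division).
fun estimatedSegments(totalBytes: Long): Long =
    (totalBytes + SEGMENT_SIZE_BYTES - 1) / SEGMENT_SIZE_BYTES

fun main() {
    val total = bytesPerRun()
    println("bytes per run: $total")                                 // 6148425
    println("estimated segment files: ${estimatedSegments(total)}")  // 7
}
```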

The resulting event store (consisting of one sensor created event and 365 update events):

977K Jan 17 22:44 00000000000000000000.events
977K Jan 17 22:45 00000000000000000059.events
977K Jan 17 22:45 00000000000000000117.events
977K Jan 17 22:45 00000000000000000175.events
977K Jan 17 22:45 00000000000000000233.events
977K Jan 17 22:45 00000000000000000291.events
977K Jan 17 22:45 00000000000000000349.events

Now I execute a second run, producing a further 365 events starting from 2024-01-17T21:48:22.384553762Z (13 segment files in total):

977K Jan 17 22:44 00000000000000000000.events
977K Jan 17 22:45 00000000000000000059.events
977K Jan 17 22:45 00000000000000000117.events
977K Jan 17 22:45 00000000000000000175.events
977K Jan 17 22:45 00000000000000000233.events
977K Jan 17 22:45 00000000000000000291.events
977K Jan 17 22:48 00000000000000000349.events
977K Jan 17 22:48 00000000000000000407.events
977K Jan 17 22:48 00000000000000000465.events
977K Jan 17 22:48 00000000000000000523.events
977K Jan 17 22:48 00000000000000000581.events
977K Jan 17 22:48 00000000000000000639.events
977K Jan 17 22:48 00000000000000000697.events

The next step is to clean up all events before the second run (ingested before 2024-01-17T21:48:22.384553762Z). To do so, I used the following code:

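EventSources
    // read the events in the token range firstToken..lastToken
    .range({ connection.eventChannel() }, firstToken, lastToken)
    // select only events of the given type with a timestamp before the cut-off
    .filter { eventWithToken ->
        logger.debug { "Event: ${eventWithToken.event.payload.type}" }
        eventWithToken.event.payload.type == eventFQDN
            && eventWithToken.event.timestamp < deleteUntil.toEpochMilli()
    }
    // mark every selected event for deletion
    .transform("Deleting events until $deleteUntil") { event, appender ->
        appender.deleteEvent(event.token)
    }
    // run the transformation and wait for the result
    .execute { connection.eventTransformationChannel() }
    .get()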
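
To make the selection logic easier to follow, here is the filter condition isolated from Axon Server. This is a minimal sketch: `SimpleEvent`, `shouldDelete` and the event type names are hypothetical stand-ins chosen for illustration, not types or names from the Axon Server connector or from the example repository.

```kotlin
import java.time.Instant

// Hypothetical stand-in for an event read from the store; not an Axon type.
data class SimpleEvent(val token: Long, val payloadType: String, val timestampMillis: Long)

// Same condition as in the filter above: the payload type must match
// and the event timestamp must be strictly before the cut-off instant.
fun shouldDelete(event: SimpleEvent, payloadType: String, deleteUntil: Instant): Boolean =
    event.payloadType == payloadType && event.timestampMillis < deleteUntil.toEpochMilli()

fun main() {
    val deleteUntil = Instant.parse("2024-01-17T21:48:22.384553762Z")
    val type = "example.ForecastUpdated" // hypothetical fully qualified event name
    val events = listOf(
        SimpleEvent(1, type, deleteUntil.minusSeconds(60).toEpochMilli()), // old forecast
        SimpleEvent(2, "example.SensorCreated", 0L),                       // different type
        SimpleEvent(3, type, deleteUntil.plusSeconds(60).toEpochMilli()),  // new forecast
    )
    val tokensToDelete = events
        .filter { shouldDelete(it, type, deleteUntil) }
        .map { it.token }
    println(tokensToDelete) // [1]
}
```

Note that the sensor created event is kept even though it is older than the cut-off, because its type does not match.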

As a result, Axon Server copies all segments in which the events have been modified (in this case deleted):

3,9K Jan 17 22:50 00000000000000000000_00003.events
977K Jan 17 22:44 00000000000000000000.events
3,6K Jan 17 22:50 00000000000000000059_00003.events
977K Jan 17 22:45 00000000000000000059.events
3,6K Jan 17 22:50 00000000000000000117_00003.events
977K Jan 17 22:45 00000000000000000117.events
3,6K Jan 17 22:50 00000000000000000175_00003.events
977K Jan 17 22:45 00000000000000000175.events
3,6K Jan 17 22:50 00000000000000000233_00003.events
977K Jan 17 22:45 00000000000000000233.events
3,6K Jan 17 22:50 00000000000000000291_00003.events
977K Jan 17 22:45 00000000000000000291.events
668K Jan 17 22:50 00000000000000000349_00003.events
977K Jan 17 22:48 00000000000000000349.events
977K Jan 17 22:48 00000000000000000407.events
977K Jan 17 22:48 00000000000000000465.events
977K Jan 17 22:48 00000000000000000523.events
977K Jan 17 22:48 00000000000000000581.events
977K Jan 17 22:48 00000000000000000639.events
977K Jan 17 22:48 00000000000000000697.events

With this transformation, the event payload and event metadata are erased (they are displayed as empty in the Axon Server Console). But as you can see, the old event segment files are still in place. The event transformation has not deleted any data; instead, it has created new copies of the affected segments. So temporarily, the consumption of disk space has increased.

The next step is compacting, which actually deletes the old event segment files from the disk. The result is the following:

3,9K Jan 17 22:50 00000000000000000000_00003.events
3,6K Jan 17 22:50 00000000000000000059_00003.events
3,6K Jan 17 22:50 00000000000000000117_00003.events
3,6K Jan 17 22:50 00000000000000000175_00003.events
3,6K Jan 17 22:50 00000000000000000233_00003.events
3,6K Jan 17 22:50 00000000000000000291_00003.events
668K Jan 17 22:50 00000000000000000349_00003.events
977K Jan 17 22:48 00000000000000000407.events
977K Jan 17 22:48 00000000000000000465.events
977K Jan 17 22:48 00000000000000000523.events
977K Jan 17 22:48 00000000000000000581.events
977K Jan 17 22:48 00000000000000000639.events
977K Jan 17 22:48 00000000000000000697.events

As you can see, an „empty“ event segment consumes almost no disk space.
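
To put numbers on the effect, the file sizes from the three directory listings can be added up. The following Kotlin sketch only uses the sizes shown above, with the unit `K` treated as KiB and the decimal comma written as a decimal point; the function names are made up for this illustration.

```kotlin
import java.util.Locale

// Segment file sizes in KiB, read off the directory listings above.
val originalSegments = List(13) { 977.0 }
val transformedSegments = listOf(3.9, 3.6, 3.6, 3.6, 3.6, 3.6, 668.0)

// Disk usage before the transformation: 13 original segment files.
fun usageBefore(): Double = originalSegments.sum()

// During the transformation the new segment versions exist next to the originals.
fun usageDuringTransformation(): Double = usageBefore() + transformedSegments.sum()

// Compaction removes the seven original files that have a transformed version.
fun usageAfterCompaction(): Double =
    transformedSegments.sum() + originalSegments.drop(transformedSegments.size).sum()

fun main() {
    fun show(label: String, kib: Double) =
        println("$label: ${"%.1f".format(Locale.ROOT, kib)} KiB")
    show("before", usageBefore())                              // 12701.0
    show("during transformation", usageDuringTransformation()) // 13390.9
    show("after compaction", usageAfterCompaction())           // 6551.9
}
```

So in this experiment, the segment files shrink to roughly half of their original total size once compaction has run.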

Conclusion

Using the Event Transformation API feels good once the first shock of „modifying the append-only event store“ is over. It can be used to radically reduce disk consumption in the case of stale or obsolete data. At the same time, beware of deleting too much: once the data is really gone, you can't restore it.

Example Code

The example I provided above is available on GitHub, so you can play with it. Check out this repository: https://github.com/holixon/axon-event-transformation-example
