Oid Data Type Conversions At Axon Framework

Ibrahim Ozturk
Trendyol Tech
Published in
6 min readDec 12, 2022

As the wallet team in Trendyol, we use the Axon framework to implement Event Sourcing and CQRS. Although Axon Framework has many advantages that it provides us, there are parts that we need to configure according to our needs. One of them is that the data that Axon Framework keeps as LOB(Large Object) in some tables are kept in the OID data type in the default entity mapping used in JPA. As our system grew, the OID data type was insufficient for our system, so we had to make improvements to convert LOB objects to other data types. Before we move on to the problem and implementation, let’s briefly discuss the OID data type and how it works.

What is the OID data type?

OID is a data type that stands for Object Identifier and is used to hold large objects. The real data is mainly kept in the system tables. The key value corresponding to the real data is kept in the user tables. The value held in User tables can be thought of as a pointer. In order to get the actual data corresponding to this key in the User table, the query Select lo_get(id value in the user table) can be run into the database. There are many advantages and disadvantages to this use.

Advantages:

  • As only the pointer value is kept in the user tables, we ensure that the user tables are grown well-formed.
  • It facilitates the estimation of the size of the data.
  • The table's record size and block size are regular and predictable.

Disadvantages:

  • In order to access the real data, we need to make an extra query.
  • There is a dependency on the system tables where the real data is kept.
  • Not human readable because user tables hold only pointers to real data
  • Difficult to debug
  • In the new versions, Postgres gradually switches the OID data type from default on mode to default off mode.

In addition, since OID consists of unsigned four-byte integer values, possible key value space is limited. It may not have a unique value in large databases. Therefore, it would be wrong to check for uniqueness in them.

Where does the Axon Framework use the OID data type?

Axon Framework can use JPA for Event Storage Engine. If the schema creation capability is used with Hibernate JPA, LOB data fields are generated as OID data type by default in the Postgres databases. So what are these fields?

  • payload and meta_data in domain_event_entry
  • payload and meta_data in snapshot_event_entry
  • token in token_entry
  • serialized_saga in saga_entry
  • diagnostic, meta_data, payload, and token fields in dead_letter_entry are kept in the database as OID.

Problems we encountered while using OID:

  • Real values and key values are kept in different fields in the database when using the OID data type. In order to access the actual value, it is necessary to run another query with the key value. Therefore, our query cost increases. When we think about Axon Framework, in particular, this situation seriously reduces performance as too many events are queried during the loading phase of Aggregates. Frequent snapshots are required to avoid this loss of performance.
  • Since the OID data is kept in the database in system columns, the size of the data here cannot be easily estimated. It becomes difficult for DBAs to calculate the growth rate of this data.
  • As the data size increases, the backup times of these data are longer, and vacuum operations are adversely affected.
  • It is challenging to authorize different DB users as the ownership and access privileges of OID fields are given at the level of each OID object.

We had to face problems like these caused by the use of the OID data type. Now, let's examine these problems and solution options through a simple oid-type-conversion application.

Alternative Solutions

Before moving on to the solution suggestions, let's introduce the application where you can experience this problem and the solutions. Our application is created with the default settings in the master branch of the application. In this section, you can see that the fields that the Axon Framework has determined as LOB have been converted to OID data type with JPA, by running ApplicationTests or by running the application and sending a request from Swagger by looking at the DDL(Data Definition Language) in the database. We implemented the conversions of the OID data type to bytea, jsonb, and XML in separate branches. Let's examine these transformations now.

Using bytea array instead of OID:

We started with bytea to examine alternative data types that we can use instead of the OID data type. We researched how to implement the conversion from OID to bytea. We have made this change by arranging the Hibernate dialect to our needs. As seen in the code snippet below, we have created our custom Hibernate dialect and set the column types whose type is Blob as bytea.

Then, we gave the ByteaSQLDialect we created to the hibernate dialect part in the application.yml file of the project.

After this change, the fields that we said to be kept as OID above are now bytea. For example, when we examine the DDL of Domain Event Entry, we see that the payload and metadata fields are bytea.

You can find these changes in the bytea branch of our oid-type-conversion project.

Another solution is to use XML instead of OID:

We will do the XML conversion of OID fields, specifically over the fields in the Domain Event Entry. Here we need to do hibernate mapping in addition to creating a custom dialect that we use in bytea conversion. We create the Xml.hbm.xml file for the Domain Event Entry, for which we will change the data type in the project. We set the SQL types of the payload and meta values ​​that we will change in this field as XML. In addition, we give the XMLUserType value that we created ourselves as the type to the fields to be changed. In this user type, we give SQLXML as SQL type. Finally, we export the Xml.hbm.xml file you created to JPA as mapping resources. You can find the implementation for this in the xml branch of our app.

You can see that the payload field is converted to XML by running the test in the ApplicationTests class in this branch. Also, when we run the xml branch, we see that the payload and metadata fields are set as XML in the DDL of the domain event entry table in the database.

The last solution is to use jsonb instead of OID:

In this solution, we have done similar to XML conversion. We created the jsonb.hbm.xml file. Here we set jsonb as SQL-type in the payload and metadata fields that we want to be jsonb. We gave the JsonbUserType value we created to these values ​​as type. Since there is no Jsonb value as SQL type in JsonbUserType, we have given the Other value. Also, since we are working with json data, application.yml is also a axon.serializer.general: jackson we have given the Serialization method as jackson. You can examine the implementation in more detail in the jsonb branch of the application.

When the DDL of the Domain event entry in the database is checked, it will be seen that the payload and metadata fields are kept as jsonb. You can also see this status by running ApplicationTests.

As a result, we decided to use bytea after evaluating these three solutions. So why did we choose bytea? The payload value in snapshot_event_entry increases in this field as the number of events increases, as it keeps all the events that that user has done before. Therefore, the data type we need to use here should be suitable for holding large data. Bytea columns seemed convenient for us as they could hold about 1GB of data. It is also easier to implement than other solutions.

Conclusion

When developing applications, we depend on many frameworks or many technologies. One feature that makes these technologies stand out is that we can configure them according to our system. Since the Axon Framework we use is also configurable, we can make changes according to our needs. In addition, it is one of the essential advantages that it integrates with known technologies such as Spring and Hibernate. As a result, we should configure the technologies we use according to our needs instead of using them directly with default settings.

Please don't hesitate to ask questions. 🚀

--

--