Sequential ID Generation on Couchbase During the Transition from SQL to NoSQL

Sahin Cayirli
Trendyol Tech
Apr 28, 2024

The Challenge

When re-platforming a legacy system, the hardest part is changing the technological infrastructure while preserving the system's current behavior during the transition. If the database management system is part of the re-platforming and the move is from a SQL to a NoSQL database, preserving that behavior becomes even harder. Sequential ID generation is one of these challenges.

In this article, we will walk through exactly this scenario and the solution we implemented as the Delivery Core team.

Context

As the Delivery Core team, we decided to rebuild our backend infrastructure, which we now call legacy and which runs on an MS SQL database, around a microservice architecture and the Couchbase NoSQL database.

During this process, one of the most critical elements for us was to be able to make the transition without affecting any of our customers and without disrupting system behaviors.

Until the process was completed, we needed a strategy to manage our data consistently and securely on both our legacy and new infrastructure.

Let’s examine our current infrastructure before moving on to our strategy

As the Delivery Core team, our primary responsibility is to build the infrastructure that ensures products purchased or returned through the Trendyol ecosystem are delivered reliably between buyer and seller. To do this, we integrate with many cargo companies.

We refer to the delivery domain object as a Shipment. The following visual shows how Shipment data is managed in our legacy system. (The visual is simplified, showing only the main flow without going into details.)

Legacy Shipment LifeCycle

When a user purchases a product through Trendyol, a create-shipment request arrives at our service named internal-api. Internal-api validates the incoming request, selects the cargo company that will carry the shipment, and sends a request to our service named charon-api. After the shipment is successfully saved to the database, an event is published to Kafka via CDC. Our cargo integration services, which listen to this event, communicate with the cargo companies and update the Shipment data until the shipment is delivered. The Shipment is stored in the database with a sequentially generated primary key.

Transition Strategy

As a team, we planned the transition cargo company by cargo company: first we would manage UPS shipments on Couchbase, then FedEx, and finally DHL.

For this, we need two things.

  1. A service that determines where a Shipment will be stored.
  2. A way for the services that update a Shipment to know which database they should write to.

We used internal-api to solve the first problem. Through a JSON file that we can update dynamically, we tell internal-api which cargo companies' shipments should be written to Couchbase.

public void createShipmentInternally(CreateShipmentRequest request) {
    if (isCargoRePlatformed(request.getCargoId())) {
        shipmentApiClient.createShipment(request); // To Couchbase
    } else {
        charonApiClient.createShipment(request); // To MS SQL
    }
}

public boolean isCargoRePlatformed(Integer cargoId) {
    return dynamicConfiguration.getRePlatformedCargos().contains(cargoId);
}

For the second problem, we set a primary key threshold. The primary key values of the latest shipment records in the MS SQL database were in the range 1_000_000_000 < X < 1_900_000_000. Based on our analysis, we decided that the key values of Shipments created in Couchbase would start from 2_000_000_000. This way, we can tell from the id value alone whether a Shipment lives in Couchbase or MS SQL.

private static final long NEW_SHIPMENT_ID_SEQ = 2_000_000_000L;

public void updateShipment(UpdateShipmentRequest request) {
    if (request.getId() >= NEW_SHIPMENT_ID_SEQ) {
        shipmentApiClient.updateShipment(request); // To Couchbase
    } else {
        charonApiClient.updateShipment(request); // To MS SQL
    }
}

At this point, we encountered a new problem: generating sequential IDs.

This feature is provided out of the box by relational database systems like MS SQL, but it is not directly available in distributed NoSQL databases like Couchbase. We needed to find a solution that suited our system.

The Solution: Counter Documents

In Couchbase, a counter document is a special type of document used for storing and managing integer values that need to be incremented or decremented atomically. This feature is particularly useful for scenarios where you need to maintain counts, statistics, or other numerical data that multiple clients might concurrently update.

How do Counter Documents work?

  1. Document Structure: A counter document is stored in the Couchbase database like any other document, but its entire content is a single integer value that can be manipulated atomically.
  2. Atomic Operations: Couchbase provides atomic increment and decrement operations for updating the value of a counter document. This means that these operations ensure that increments or decrements are applied as a single, indivisible operation, even when multiple clients are concurrently attempting to modify the value.
  3. Concurrency Control: Counter updates are applied on the Couchbase server itself, so clients do not need to read, modify, and write back the value. When a client wants to increment or decrement a counter, it simply sends the desired operation to the server; if multiple clients update the same counter simultaneously, Couchbase applies all updates correctly and no increments or decrements are lost.
  4. Durability and Consistency: Couchbase provides configurable durability and consistency options for counter operations. You can choose the level of durability and consistency that suits your application’s requirements, balancing performance with data safety.
  5. Performance Considerations: Counter documents in Couchbase are designed for high-performance scenarios where frequent increments or decrements are expected. Couchbase optimizes these operations for efficiency, making them suitable for use in a wide range of applications.
  6. Usage: To use a counter document in Couchbase, you typically perform increment and decrement operations using the Couchbase SDK for your programming language of choice. The SDK provides methods or functions specifically for these operations, making it easy to integrate counter functionality into your application code.

Overall, Couchbase counter documents provide a convenient and efficient way to manage integer values that need to be incremented or decremented atomically in a distributed, concurrent environment. They are a powerful tool for building scalable and robust applications.

Now that we understand Counter Documents, let's see how we use them

The algorithm is quite simple: before creating each new Shipment record, we go to our counter document, get a new id, and create the Shipment record with this id.

To do this, we first created a new service called Shipment-Api that handles operations such as creating and updating the shipments stored in Couchbase.

Then, in our shipment._default._default collection (bucket.scope.collection), we created a counter document with the key ShipmentDao_SEQ and an initial value of 2_000_000_000.

Counter Doc

Now, when a create shipment request comes to Shipment-Api, we can get the new id from our counter document named ShipmentDao_SEQ and create our record.
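To make this concrete, below is a minimal sketch of that flow using the Couchbase Java SDK 3.x. The class, bucket wiring, and document shape are illustrative assumptions rather than the actual Shipment-Api code; only the ShipmentDao_SEQ counter key and the shipment bucket come from the setup above.

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.json.JsonObject;
import com.couchbase.client.java.kv.IncrementOptions;

public class ShipmentWriter {

    private final Collection shipmentCollection;

    public ShipmentWriter(Cluster cluster) {
        Bucket bucket = cluster.bucket("shipment");
        this.shipmentCollection = bucket.defaultCollection(); // shipment._default._default
    }

    public long createShipment(int cargoId) {
        // Atomically increment the counter document; the returned value becomes the new shipment id.
        long newId = shipmentCollection.binary()
                .increment("ShipmentDao_SEQ", IncrementOptions.incrementOptions().delta(1))
                .content();

        // Store the shipment under a key derived from the sequential id (illustrative document shape).
        shipmentCollection.insert(String.valueOf(newId), JsonObject.create()
                .put("id", newId)
                .put("cargoId", cargoId)
                .put("status", "CREATED"));

        return newId;
    }
}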

The Transition Process

As can be seen from the image, thanks to this infrastructure we can make a smooth transition from MS SQL to Couchbase, cargo company by cargo company, without disrupting the current flow.

Do we need to go to the Counter document for each data record? Can we make this more efficient?

Yes, we can. Couchbase SDKs let us increment a counter document by more than one at a time.

// Returns 2_000_000_500
// This number covers the next 500 ids for shipments
binaryCollection.increment("ShipmentDao_SEQ", IncrementOptions.incrementOptions().delta(500)).content();

We chose 500 as the block size. When Shipment-Api starts up, it goes to the ShipmentDao_SEQ counter document, reserves the ids of the next 500 Shipments to be created, and holds them in memory. This gives us currentId and lastId values. So instead of going to the ShipmentDao_SEQ document for an id on every record, we go once every 500 records, as sketched below.
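Here is a minimal sketch of this block-allocation idea, again with the Couchbase Java SDK 3.x. The class and field names are illustrative, and it assumes each instance treats the value returned by the increment as the first id of its newly reserved block, so generated ids fall in the range currentId ≤ x < lastId; the real Shipment-Api may structure this differently.

import com.couchbase.client.java.Collection;
import com.couchbase.client.java.kv.IncrementOptions;

public class ShipmentIdGenerator {

    private static final long BLOCK_SIZE = 500L;
    private static final String COUNTER_KEY = "ShipmentDao_SEQ";

    private final Collection shipmentCollection;
    private long currentId; // next id to hand out
    private long lastId;    // exclusive upper bound of the reserved block

    public ShipmentIdGenerator(Collection shipmentCollection) {
        this.shipmentCollection = shipmentCollection;
        fetchNextBlock();
    }

    // Returns the next sequential id; Couchbase is contacted only when the current block is exhausted.
    public synchronized long nextId() {
        if (currentId >= lastId) {
            fetchNextBlock();
        }
        return currentId++;
    }

    private void fetchNextBlock() {
        // Atomically advance the counter by 500, reserving that block for this instance.
        long result = shipmentCollection.binary()
                .increment(COUNTER_KEY, IncrementOptions.incrementOptions().delta(BLOCK_SIZE))
                .content();
        currentId = result;           // assumption: the returned value is the first id of the block
        lastId = result + BLOCK_SIZE; // ids up to, but not including, lastId belong to this block
    }
}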

Now let’s examine it in more detail with visuals.

At Trendyol, we run our services on Kubernetes clusters. Let's set up our shipment-api service on 2 Kubernetes clusters with 3 replicas each.

Let’s create a counter document with a value of 2_000_000_000 in the shipment._default._default collection in our Couchbase cluster.

Before Services Up And Running

Our services will start almost simultaneously in 6 separate pods, and each will send an increment request to Couchbase right after starting to learn which ids it can use for the documents it will create. For the purpose of the demo, let's fix an order ourselves.

In Order

  1. shipment-api-1
  2. shipment-api-4
  3. shipment-api-6
  4. shipment-api-2
  5. shipment-api-3
  6. shipment-api-5

If the pods send their increment requests to Couchbase in this order, our services end up in the following final state.

After Services Up and Running
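To make the figure concrete, one possible assignment of id blocks under the order above would look like this (illustrative numbers, assuming each pod uses the value returned by its increment request as the first id of its block of 500):

  1. shipment-api-1: currentId 2_000_000_500, lastId 2_000_001_000
  2. shipment-api-4: currentId 2_000_001_000, lastId 2_000_001_500
  3. shipment-api-6: currentId 2_000_001_500, lastId 2_000_002_000
  4. shipment-api-2: currentId 2_000_002_000, lastId 2_000_002_500
  5. shipment-api-3: currentId 2_000_002_500, lastId 2_000_003_000
  6. shipment-api-5: currentId 2_000_003_000, lastId 2_000_003_500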

Thanks to this algorithm, we sent only 6 increment requests to Couchbase to obtain key values for 3000 shipment records. This provides a significant performance gain.

As can be seen from the image, every service can now generate sequential key values for shipment data in the range currentId ≤ x < lastId without going to Couchbase.

When Shipment-Api reaches the last id it holds, it sends a request to Couchbase again, reserves the next 500 ids, and continues working in this way.

By following these steps, we have managed to handle all our data securely in the Couchbase database. The final state of the flow is as follows.

Final State

Be Careful!

This solution is not a silver bullet. It comes with a side effect: gaps between the ids of the data.

Let's say at time T the id of the last record in Couchbase is 2_000_300_000, and the id of the last record created by shipment-api-2 is 2_000_297_000. In this case, if we deploy a feature or the pod restarts, the new instance will start recording data from 2_000_300_500.

We call this situation a 'gap': consecutively created records end up with id values farther apart than intended, because the unused ids from the abandoned blocks are never assigned.

In this scenario, there will be a gap of about 3500 between two consecutively created records. The size of the gap depends on the chosen delta, the number of pods, and how often deployments happen. To keep the gap small, the delta should be kept low.

Conclusion

Changing the database management system brings many difficulties. To produce the best solutions to them, it is essential to learn the features offered by the target database well.

The counter documents offered by Couchbase make it possible to generate atomic, consistent sequential ids for a system's data in a distributed architecture.

With this solution, we are able to manage millions of delivery records in a more secure, consistent, and fast manner, taking advantage of what Couchbase offers.

Trendyol Careers

Would you like to be a part of our growing company? Join us! Check out our open positions and other media pages from the links below.
