How Coupang built a microservice architecture
Part I of a two-part series about our transition from a monolithic architecture to a microservice architecture
By Jaehoon Jeong
This post is also available in Korean.
At its inception, Coupang ran on a single monolithic architecture. Our services were encapsulated in neat modules that were tightly coupled and deployed as one large unit. However, as the number of Coupang customers grew into the tens of thousands and the number of engineers into the hundreds, the monolithic architecture became an impediment to our accelerating growth, largely due to issues with scalability and separation of concerns.
Confident that business growth would not slow anytime soon, we decided in late 2013 to transition from a monolithic architecture to a microservice architecture. In this post, we discuss the motivations behind that decision and the strategies we developed to make such a major re-architecture swift and secure.
Monolithic architecture
A monolithic architecture is a traditional software design in which the components that make up a service are not separated into independent services but instead exist as large modules with well-defined interfaces, each interconnected with the others. It has the benefit of being quick and inexpensive to launch while also being quickly adaptable to customer feedback.
For these reasons, Coupang also started out with a monolithic architecture built on Apache and Tomcat servers. At this phase, we had a single Git repository that contained all the components of our service, as is typical of a traditional monolithic architecture. The structure of our repository is shown below.
Release/Coupang
├── coupang-api
├── coupang-batch
├── coupang-calculate
├── coupang-common
├── coupang-front
├── coupang-login
├── coupang-mobile-api
├── coupang-order
└── coupang-wing
Although this design worked for us in the early phase of our business, as the number of users and the amount of data grew exponentially over the years, we ran into five significant pain points with the monolithic architecture. Below, we will discuss these five pain points in more detail.
Extended risk and low reliability
Because of the tightly coupled nature of a monolithic architecture, a failure in one component often cascaded to other components and disrupted the whole system. For example, code that caused an out-of-memory error in the Order component exhausted memory for the entire server. Minor code errors or sudden spikes in traffic could lead to catastrophic server and service failures.
Poor separation of concerns
The Git repository had a coupang-common branch where common modules used across multiple teams were collected for ease of maintenance. However, as our engineering organization grew, the code in coupang-common became legacy code. Distinguishing code blocks that were in use from ones that were not became difficult, and there were few rules for managing the shared code. To edit a single method, an engineer had to send out a company-wide email. Engineers resorted to simply copying code blocks, creating an even larger and more confusing codebase.
Poor scalability
In a monolithic architecture, scaling out means scaling out the entire platform. This can be achieved by securing additional servers for the whole system, but such a scale-out is both expensive and inefficient. Simply adding server instances does not resolve the fundamental architectural problems that cause bottlenecks. Specific examples of bottlenecks are discussed in the next two pain points.
Cost-ineffective testing
For security reasons, our engineers run unit and regression tests for every new feature. Under the monolithic architecture, small and large modifications alike were tested against the entire codebase. As our codebase and number of features grew, the cost of testing even the smallest modification increased substantially.
Inefficient deployment
The deployment process became a bottleneck for the same reason. To ensure secure and orderly deployment, we used deployment flags: only teams with the deployment flag could modify and deploy the codebase. However, as hundreds of engineers joined us and dozens of features were developed and modified daily, deployments were greatly delayed, so much so that an engineer who edited a single line of code had to wait three days to deploy it.
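To make the bottleneck concrete, here is a minimal sketch of how a shared deployment flag serializes work. It assumes a single flag held by one team at a time and a first-come, first-served waiting line; the class and method names (DeploymentFlag, request, release) are hypothetical and do not describe Coupang's actual tooling.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of one deployment flag shared by every team. */
final class DeploymentFlag {
    private String holder;                                  // team currently allowed to deploy
    private final Deque<String> waiting = new ArrayDeque<>(); // teams queued behind the holder

    /** Returns true if the team receives the flag now; otherwise the team is queued. */
    boolean request(String team) {
        if (holder == null) {
            holder = team;
            return true;
        }
        waiting.add(team);
        return false;
    }

    /** Hands the flag to the next waiting team and returns it, or null if nobody waits. */
    String release(String team) {
        if (!team.equals(holder)) {
            throw new IllegalStateException(team + " does not hold the flag");
        }
        holder = waiting.poll();
        return holder;
    }

    /** Number of teams that must deploy before a newly arriving team gets its turn. */
    int queueLength() {
        return waiting.size();
    }
}
```

In this model, a team with a one-line change that calls request after two other teams must wait for both to finish and release the flag, regardless of how small its own change is.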
Microservice architecture
Because of the pain points listed above, the monolithic architecture was viewed as the single largest hurdle blocking Coupang’s growth. To resolve these issues and accelerate our business expansion, we began the Vitamin Project, a roadmap designed to support our transition from a monolithic architecture to a microservice architecture.
A microservice architecture is structured as a collection of narrowly scoped services that are loosely coupled, meaning that these services communicate with each other over the network. Although the overall structure may be more complex, each service in a microservice architecture can be deployed and tested independently.
In this section, we detail the strategies we adopted for a secure and timely transition that preserved the legacy system without disrupting the business.
Framework
We first designed the Vitamin Framework to operate the microservice architecture. The Vitamin Framework is a Java-based system that includes Coupang legacy libraries and standard skeleton code templates that support development in the frontend, API, and backend domains. Using our framework, each domain team can easily test, deploy, monitor, and automatically recover services, and can also integrate with other internal platform services such as the message queue and cache. The Vitamin Framework provides teams with the technical foundation to focus on implementing business logic.
Client-side helper library
In a typical microservice architecture, the separated domains use RESTful APIs to communicate with each other. To use a specific API, every client must build its own module that handles HTTP communication, sends a JSON API request, and maps the response to an object. For example, if ten domain services use the product API, each of the ten teams must implement its own module. To reduce such redundancy, we developed a library that provides users with API-calling modules. This library depends on API versions and can itself become a monolithic system as versions grow complex. Despite this shortcoming, the approach worked for us at Coupang.
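The idea of a shared API-calling module can be sketched roughly as follows. This is an illustration only, not the actual Coupang library: the Product fields, the /v1/products/{id} path, and the class names are assumptions, and the regular-expression field extraction is a stand-in for a real JSON mapper. The HTTP transport uses the JDK's java.net.http client (Java 11 or later) and is injected so that the mapping logic can be exercised without a network.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Typed result that callers work with instead of raw JSON (fields are assumptions). */
final class Product {
    final long id;
    final String name;
    final long price;

    Product(long id, String name, long price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }
}

/** Hypothetical shared client: one implementation of HTTP call plus mapping for all teams. */
final class ProductApiClient {
    private final String baseUrl;
    private final Function<String, String> transport; // maps a URL to a response body

    ProductApiClient(String baseUrl, Function<String, String> transport) {
        this.baseUrl = baseUrl;
        this.transport = transport;
    }

    /** Transport backed by the JDK HTTP client. */
    static Function<String, String> httpTransport() {
        HttpClient http = HttpClient.newHttpClient();
        return url -> {
            try {
                HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                        .header("Accept", "application/json")
                        .GET()
                        .build();
                return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
            } catch (IOException e) {
                throw new IllegalStateException("Product API call failed: " + url, e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while calling: " + url, e);
            }
        };
    }

    /** Fetches one product and maps the JSON response to a Product object. */
    Product getProduct(long id) {
        String json = transport.apply(baseUrl + "/v1/products/" + id);
        return new Product(
                Long.parseLong(field(json, "id", "(-?\\d+)")),
                field(json, "name", "\"([^\"]*)\""),
                Long.parseLong(field(json, "price", "(-?\\d+)")));
    }

    // Stand-in for a real JSON library; handles only flat objects with simple values.
    private static String field(String json, String key, String valuePattern) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(key) + "\"\\s*:\\s*" + valuePattern)
                .matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("Missing field: " + key);
        }
        return m.group(1);
    }
}
```

A consuming team would construct the client once, for example with ProductApiClient.httpTransport() and the product service's base URL, and call getProduct(id) instead of writing its own HTTP and mapping code. The version coupling mentioned above shows up here too: if the response shape changes, every consumer has to upgrade the shared library.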
Message queue
In a monolithic architecture, the components of the application are tightly coupled. For example, when an order is placed on the Coupang app, payment and delivery requests are made. These steps are processed as a single long transaction, which means that if an error occurs in any step, the whole process is disrupted. We wanted to decouple these interconnected services in the backend but still have them behave like a single transaction on the frontend, which was no easy feat.
To solve this difficult task, we engineered an in-house message queue platform called Vitamin MQ. Vitamin MQ converts transactions into messages for all microservices in a safe and reliable manner. When an order is placed, Vitamin MQ generates a message, or event, that triggers a delivery request and so on. Transactions that were previously connected are broken up into smaller, loosely coupled events, as befits a microservice architecture. Vitamin MQ not only decouples transactions but also improves fault tolerance: messages that are not completed are automatically forwarded to the dead letter queue, where they are held for reprocessing after service recovery.
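The pattern described above, an order event fanned out to independent consumers with failed deliveries parked in a dead letter queue, can be sketched in a few lines. This is a simplified in-memory illustration and not Vitamin MQ itself: the class name SketchQueue, the topic string, and the retry behavior are all assumptions, and a real message queue would add persistence, ordering, and delivery across processes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-memory queue; names and behavior are illustrative assumptions. */
final class SketchQueue {

    /** An event such as "order.placed" with a payload such as an order id. */
    static final class Message {
        final String topic;
        final String payload;

        Message(String topic, String payload) {
            this.topic = topic;
            this.payload = payload;
        }
    }

    /** A failed delivery: the message plus the one handler that still has to process it. */
    private static final class DeadLetter {
        final Message message;
        final Consumer<Message> handler;

        DeadLetter(Message message, Consumer<Message> handler) {
            this.message = message;
            this.handler = handler;
        }
    }

    private final Map<String, List<Consumer<Message>>> subscribers = new HashMap<>();
    private final Deque<DeadLetter> deadLetters = new ArrayDeque<>();

    void subscribe(String topic, Consumer<Message> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    /** Delivers to every subscriber independently, so one failure does not block the rest. */
    void publish(Message message) {
        List<Consumer<Message>> handlers =
                subscribers.getOrDefault(message.topic, Collections.emptyList());
        for (Consumer<Message> handler : handlers) {
            deliver(message, handler);
        }
    }

    /** Retries each parked delivery once; returns how many succeeded this time. */
    int reprocessDeadLetters() {
        int pending = deadLetters.size();
        int recovered = 0;
        for (int i = 0; i < pending; i++) {
            DeadLetter letter = deadLetters.poll();
            if (deliver(letter.message, letter.handler)) {
                recovered++;
            }
        }
        return recovered;
    }

    int deadLetterCount() {
        return deadLetters.size();
    }

    private boolean deliver(Message message, Consumer<Message> handler) {
        try {
            handler.accept(message);
            return true;
        } catch (RuntimeException e) {
            // Park the failed delivery instead of failing the publisher.
            deadLetters.add(new DeadLetter(message, handler));
            return false;
        }
    }
}
```

In this sketch, an order service publishes one "order.placed" message, and payment and delivery handlers subscribe separately. If the delivery handler throws while its service is down, the payment handler is unaffected, and calling reprocessDeadLetters() after recovery retries only the delivery that failed.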
Conclusion
In this post, we discussed the five major pain points we faced in a monolithic system and the three strategies we developed for a robust and speedy transition to a microservice architecture. Using our Vitamin Framework and other services, our engineers at Coupang develop, test, and deploy in smaller domains, which increases engineering efficiency.
We believe that when a company surpasses a certain point in its business, a monolithic architecture blocks growth. For other companies facing extremely fast growth like Coupang, a transition to a microservice architecture is not only a recommendation, but a must.
Series index
This is part 1 of a two-part series about our transition from a monolithic architecture to a microservice architecture.
Part 1 — How Coupang built a microservice architecture
Part 2 — Integrating platform services to a microservice architecture
If you believe you can contribute to engineering and improving a complex microservice architecture, see our open positions.