How Coupang built a microservice architecture

Part I of a two-part series about our transition from a monolithic architecture to a microservice architecture

Coupang Engineering
Coupang Engineering Blog
17 min read · May 25, 2018

By Jaehoon Jeong

This post is also available in Korean.

At its inception, Coupang ran on a single monolithic architecture. Our services were encapsulated in neat modules that were tightly coupled and deployed as one large unit. However, as the number of Coupang customers grew into the tens of thousands and the number of engineers into the hundreds, the monolithic architecture became an impediment to our accelerating growth, largely due to issues with scalability and separation of concerns.

Confident that business growth would not slow anytime soon, we decided in late 2013 to transition from a monolithic architecture to a microservice architecture. In this post, we discuss the motivations behind that decision and the strategies we developed to make such a major re-architecture swift and safe.

Monolithic architecture

Figure 1. The monolithic architecture at Coupang up until 2013

A monolithic architecture is a traditional software design in which the components that make up a service are not separated but instead exist as large, interconnected modules with well-defined interfaces. It has the benefit of being quick and inexpensive to launch, and it is easy to adapt in response to customer feedback.

For these reasons, Coupang also started out with a monolithic architecture composed of Apache and Tomcat servers. At this phase, we had a single Git repository that contained all the components of our service, as is typical of a traditional monolithic architecture. The structure of our repository is shown below.

Release/Coupang
├── coupang-api
├── coupang-batch
├── coupang-calculate
├── coupang-common
├── coupang-front
├── coupang-login
├── coupang-mobile-api
├── coupang-order
└── coupang-wing

Although this design worked for us in the early phase of our business, as the number of users and the amount of data grew exponentially over the years, we ran into five significant pain points with the monolithic architecture. Below, we will discuss these five pain points in more detail.

Extended risk and low reliability

Figure 2. In a monolithic architecture, a failure in the Order component (top) led to a failure of all components (bottom).

Because of the tightly coupled nature of a monolithic architecture, a failure in one component often cascaded to other components and disrupted the whole system. For example, code that caused an out-of-memory error in the Order component led to a memory failure of the entire server. Minor code errors or sudden spikes in traffic could lead to catastrophic server and service failures.

Poor separation of concerns

The Git repository had a coupang-common module where code shared across multiple teams was collected for ease of maintenance. However, as our engineering organization grew, the code in coupang-common turned into legacy code. Distinguishing code blocks that were in use from ones that were not became difficult, and there were few clear rules for managing them. To edit a single method, an engineer had to send out a company-wide email. Engineers resorted to simply copying code blocks, creating an even larger and more confusing codebase.

Poor scalability

In a monolithic architecture, scaling out means scaling out the entire platform. This can be achieved by securing additional servers for the whole system, but doing so is both expensive and inefficient. Simply adding server instances does not resolve the fundamental architectural problems that cause bottlenecks. Specific examples of such bottlenecks are discussed in the next two pain points.
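The inefficiency can be made concrete with a little arithmetic. The following is only a sketch: the component names are borrowed from the repository listing above, but every number is invented for illustration and does not describe Coupang's actual servers. It compares the memory needed when every replica must carry every component against the memory needed when each component is replicated on its own.

```python
# Hypothetical figures, invented purely for illustration:
# memory footprint of one instance (GB) and instances needed for the current load.
COMPONENTS = {
    "order": {"memory_gb": 4, "instances_needed": 10},
    "login": {"memory_gb": 1, "instances_needed": 2},
    "front": {"memory_gb": 2, "instances_needed": 3},
}


def monolith_memory_gb(components):
    """Every replica carries every component, so the busiest one sets the replica count."""
    replicas = max(c["instances_needed"] for c in components.values())
    per_replica = sum(c["memory_gb"] for c in components.values())
    return replicas * per_replica


def microservice_memory_gb(components):
    """Each service is replicated only as many times as its own load requires."""
    return sum(c["memory_gb"] * c["instances_needed"] for c in components.values())


print(monolith_memory_gb(COMPONENTS))      # 10 replicas x 7 GB each
print(microservice_memory_gb(COMPONENTS))  # 40 + 2 + 6 GB
```

Under these assumed numbers, the monolith pays for ten copies of the lightly loaded login and front components just to give the order component enough capacity.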

Cost-ineffective testing

For security reasons, our engineers ran unit and regression tests for every new feature. Small and large modifications alike were tested against the entire codebase. As our codebase and number of features grew, the cost of testing even the smallest modification increased substantially.

Inefficient deployment

A bottleneck also occurred in the deployment process for the same reasons. Previously, to ensure secure and orderly deployment, we used deployment flags. Only teams holding the deployment flag could modify and deploy the codebase. However, as hundreds of engineers joined us and dozens of features were developed and modified daily, deployment was delayed greatly, so much so that an engineer who edited a single line of code had to wait three days to deploy it.

Figure 3. A visual representation of the deployment bottleneck at Coupang with the monolithic architecture
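To see why a deployment flag turns into a queue, consider the following minimal sketch. It assumes a single flag that is passed from team to team and a fixed number of hours that each team holds it. Both are simplifying assumptions made for illustration, and the figures are hypothetical.

```python
def wait_hours_with_single_flag(hold_hours):
    """Hours each team waits before it receives the flag, in queue order."""
    waits = []
    elapsed = 0
    for hours in hold_hours:
        waits.append(elapsed)  # a team waits for everyone ahead of it
        elapsed += hours
    return waits


def wait_hours_independent(hold_hours):
    """With independently deployable services, no team waits on another."""
    return [0 for _ in hold_hours]


# Six hypothetical teams, each holding the flag for 12 hours.
teams = [12] * 6
print(wait_hours_with_single_flag(teams))
print(wait_hours_independent(teams))
```

In this toy model the last team in line waits 60 hours, regardless of how small its own change is, because the wait depends only on the teams ahead of it.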

Microservice architecture

Because of the pain points listed above, the monolithic architecture was viewed as the single largest hurdle blocking Coupang’s growth. To resolve these issues and accelerate our business expansion, we began the Vitamin Project, a roadmap designed to support our transition from a monolithic architecture to a microservice architecture.

A microservice architecture is structured as a collection of narrowly scoped services that are loosely coupled, meaning that these services communicate with each other over the network. Although the overall structure may be more complex, each service in a microservice architecture can be deployed and tested independently.

In this section, we detail the strategies adopted for a secure and timely transition that would preserve the legacy system while not disrupting business.

Framework

We first designed the Vitamin Framework to operate the microservice architecture. The Vitamin Framework is a Java-based system that includes Coupang legacy libraries and standard skeleton code templates that support development in the frontend, API, and backend domains. Using our framework, each domain team can easily test, deploy, monitor, and automatically recover its services, and can also integrate with other internal platform services such as the message queue and cache. The Vitamin Framework gives teams the technical foundation they need to focus on implementing business logic.

Client-side helper library

Figure 4. The architecture of our API-adapter

In a typical microservice architecture, the separated domains use RESTful APIs to communicate with each other. To use a specific API, every client must build its own module that handles HTTP communication, sends the API request, and maps the JSON response to an object. For example, if ten domain services use the product API, each of the ten teams must implement its own module. To reduce such redundancy, we developed a library that provides users with API-calling modules. This library depends on API versions and can itself become a monolithic system as versions grow complex. Despite this shortcoming, the approach worked for us at Coupang.
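The idea behind such a helper library can be sketched as follows. This is not Coupang's API-adapter. It is written in Python for brevity, and the `Product` fields, the `/v1/products/{id}` path, and the class name `ProductApiClient` are all hypothetical. The sketch only shows how one shared module can own the HTTP call and the JSON-to-object mapping so that calling teams do not each rewrite them.

```python
import json
from dataclasses import dataclass
from typing import Callable
from urllib.request import urlopen


@dataclass(frozen=True)
class Product:
    """Hypothetical response type; the real product API schema is not given in this post."""
    product_id: int
    name: str
    price: int


def http_get(url: str) -> str:
    """Default transport: a plain HTTP GET using only the standard library."""
    with urlopen(url, timeout=3) as response:
        return response.read().decode("utf-8")


class ProductApiClient:
    """Shared helper so that callers do not hand-write HTTP and JSON mapping code."""

    def __init__(self, base_url: str, transport: Callable[[str], str] = http_get):
        self._base_url = base_url.rstrip("/")
        self._transport = transport  # injectable, so tests need no network

    def get_product(self, product_id: int) -> Product:
        # The /v1/products/{id} path is an assumed, illustrative endpoint.
        body = self._transport(f"{self._base_url}/v1/products/{product_id}")
        data = json.loads(body)
        return Product(
            product_id=data["productId"],
            name=data["name"],
            price=data["price"],
        )
```

A calling service would then only write `ProductApiClient(base_url).get_product(7)`. The version path baked into the client also hints at the shortcoming mentioned above: every API version change requires a new release of the shared library.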

Message queue

Figure 5. Tightly coupled architecture of the monolithic architecture (top) and the loosely coupled architecture of the microservice architecture with message queue (bottom)

In a monolithic architecture, the components of the application are tightly coupled. For example, when an order is placed on the Coupang app, payment and delivery requests are made. These steps are processed as a single long transaction, which means that if an error occurs in any step, the whole process is disrupted. We wanted to decouple these interconnected services in the backend but still have them behave like a single transaction on the frontend, which was no easy feat.

To solve this problem, we engineered an in-house message queue platform called Vitamin MQ. Vitamin MQ converts transactions into messages for all microservices in a safe and error-proof manner. When an order is placed, Vitamin MQ generates a message, or event, that triggers the delivery request and the other downstream steps. Transactions that were previously connected are broken up into smaller, loosely coupled events, as befits a microservice architecture. Vitamin MQ not only decouples transactions but also improves fault tolerance: messages that are not completed are automatically forwarded to the dead letter queue, where they are held for reprocessing after service recovery.
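The pattern described above can be illustrated with a toy in-memory queue. This is a sketch of the general publish/subscribe and dead letter queue technique, not of Vitamin MQ itself. The class `InMemoryQueue`, the topic name `order.placed`, the retry limit, and the handler functions are all assumptions introduced for the example.

```python
from collections import deque


class InMemoryQueue:
    """Toy stand-in for a message queue with a dead letter queue (DLQ)."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.handlers = {}      # topic -> list of subscriber callables
        self.pending = deque()  # deliveries waiting to be processed
        self.dead_letters = []  # deliveries that exhausted their attempts

    def subscribe(self, topic, handler):
        self.handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        # One independent delivery per subscriber, so one failure cannot block the rest.
        for handler in self.handlers.get(topic, []):
            self.pending.append({"handler": handler, "payload": payload, "attempts": 0})

    def drain(self):
        """Process pending deliveries, retrying failures up to max_attempts."""
        while self.pending:
            delivery = self.pending.popleft()
            try:
                delivery["handler"](delivery["payload"])
            except Exception:
                delivery["attempts"] += 1
                if delivery["attempts"] >= self.max_attempts:
                    self.dead_letters.append(delivery)
                else:
                    self.pending.append(delivery)

    def redrive(self):
        """Move dead letters back to the queue, for example after a consumer recovers."""
        while self.dead_letters:
            delivery = self.dead_letters.pop(0)
            delivery["attempts"] = 0
            self.pending.append(delivery)


# Usage: the delivery service is down when the order is placed.
queue = InMemoryQueue(max_attempts=2)
paid, shipped = [], []
delivery_service = {"up": False}


def request_payment(order):
    paid.append(order["order_id"])


def request_delivery(order):
    if not delivery_service["up"]:
        raise RuntimeError("delivery service unavailable")
    shipped.append(order["order_id"])


queue.subscribe("order.placed", request_payment)
queue.subscribe("order.placed", request_delivery)
queue.publish("order.placed", {"order_id": 1})
queue.drain()                  # payment succeeds; delivery fails twice and is dead-lettered
dead_after_outage = len(queue.dead_letters)

delivery_service["up"] = True  # the service recovers
queue.redrive()
queue.drain()                  # the delivery request is reprocessed
```

The payment step completes even though the delivery step fails, and the failed delivery request is not lost: it waits in the dead letter list until the service is back and the message is redriven.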

Conclusion

In this post, we discussed the five major pain points we faced in a monolithic system and the three strategies we developed for a robust and speedy transition to a microservice architecture. Using our Vitamin Framework and other services, our engineers at Coupang now develop, test, and deploy within smaller domains, which increases engineering efficiency.

We believe that when a company surpasses a certain point in its business, a monolithic architecture blocks growth. For other companies that, like Coupang, face extremely fast growth, a transition to a microservice architecture is not only a recommendation, but a must.

Series index

This is part 1 of a two-part series about our transition from a monolithic architecture to a microservice architecture.

Part 1 — How Coupang built a microservice architecture

Part 2 — Integrating platform services to a microservice architecture

If you believe you can contribute to engineering and improving a complex microservice architecture, see our open positions.
