The evolution of our Android message passing mechanism

How Coupang used LiveData to create a scalable and customizable message bus

Coupang Engineering
Coupang Engineering Blog
9 min read · May 6, 2022

By Ju Cai

This post is also available in Korean.

As the Coupang business expanded in the late 2010s, the complexity of our mobile application also multiplied. Growing business requirements resulted in not only a larger codebase, but also a complicated coupling of class dependencies. Fulfilling a new business requirement demanded modifying and merging modules with conflicting logic; a slight modification of a single function affected multiple segments of the code. The process was error-prone, and even simple changes required complete regression tests to ensure reliability.

There was an urgent need to decouple the modules and the pages within each module. To do this, we designed the app’s modules and pages to interact and exchange data through a message passing interface.

Message passing enables simple and direct communication between internal modules based on event data and is crucial to realizing cohesive, modularized, and decoupled development. Furthermore, a message passing interface allows independent functional development and maintenance.

In this post, we will discuss how the message passing mechanism has evolved at Coupang for our Android app.

Table of contents

· Background
· Modern Period
· Implementation
· Usage examples
· Conclusion

Background

Message passing in the Coupang app has gone through three stages of development that we internally refer to as the Stone Age (early 2012–2018), the Iron Age (2018–2019), and the Modern Period (2019–present). Throughout these stages, our message passing components have become progressively more elegant. This section briefly introduces the first two stages.

Stone Age

During the Stone Age, the Coupang app had only five activities, one for each of the following pages: the Coupang home page, the cart page, the MyCoupang page, the product list page, and the search results page.

During this stage, we used Android’s Handler and broadcast mechanisms for message passing. We named this phase the Stone Age of message communication because the approach was simplistic. Message tracking was easy: each message was defined as a variable, and its caller or responder could be found with the IDE’s find function.

However, this approach suffered from high coupling and a high risk of memory leaks. In addition, it lacked message management and lifecycle awareness.

Iron Age

Five activities were far from enough to handle our increasing business needs. Accordingly, we adopted the activity-based ViewEventManager, which works like an event bus.

Figure 1. The ViewEventManager architecture we used for message passing during the Iron Age.

As shown in the figure above, a View sends an event to the eventSender, which delivers the event to its corresponding eventHandler. Then, the message bus traverses all the registered event handlers, finds the matching event types, and processes them.

However, with accelerating business iterations, we ran into several problems with this approach. First, it was almost impossible to debug, because it was extremely difficult to track messages. All the registered events needed to be carefully examined to find the message responder. This negatively impacted development efficiency.

Second, because the eventHandler held the View, it was highly vulnerable to memory leaks when many events were processed simultaneously. In addition, it did not support sticky message passing and lacked a message management system.

To solve these shortcomings and satisfy the needs of our increasingly demanding business, an entirely new message passing mechanism was developed, transitioning us to the Modern Period.

Modern Period

This section will describe the design and implementation of our Modern Period message passing mechanism, as well as some usage examples.

System design

To support rapid business growth and improve parallel development efficiency, the message passing mechanism had three requirements. The first was rich message type support. The second was tracking capabilities; we wanted to streamline the entire process of message tracking. For example, we wanted to easily find the sender and receiver of a message. Lastly, we wanted lifecycle awareness of message receivers and senders. For example, we didn’t want a receiver to process a message while its lifecycle was in an inactive state.

Figure 2. The flow of data in the ElegantLiveDataBus architecture. ElegantLiveDataBus uses dynamic proxy technology to bind a single message to its corresponding message channel.

To meet all our needs, we proposed and designed a message bus system called the ElegantLiveDataBus as our new message passing mechanism. This mechanism uses the Android LiveData class and has the following features:

  • Diverse message type support. ElegantLiveDataBus supports all types of messages, including Java built-in types such as String as well as user-defined types.
  • Constrained but customizable message definition. Messages are defined through an interface, which can establish strong constraints between publishers and subscribers. In addition, messages from the same business can be defined and aggregated in a single interface for easy management.
  • Isolated message channel. ElegantLiveDataBus uses dynamic proxy technology to bind a message to its message channel.
  • Scalable and compatible publisher and subscriber. Publishers use the standard LiveData API, namely the setValue(T) and postValue(T) methods, and subscribers use the standard Observer class.
  • Sticky and non-sticky message passing mode. Our system supports both sticky and non-sticky message passing in real-time.

In addition to the features above, our event bus system is safe and lifecycle aware, meaning there is a low risk of memory leaks and messages can be received in real time during the entire lifecycle, from onCreate() until onDestroy().

Below you can see a simplistic comparison of the three phases of message passing at Coupang.

Implementation

Let’s take a look at the technical details of implementation on a code level.

Interface

As shown below, the event bus is essentially a HashMap where each LiveData is bound to a channel. The messages defined in the interface are each mapped to the appropriate channel through a dynamic proxy.
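The idea can be illustrated with a minimal, plain-Java sketch. It is not the ElegantLiveDataBus source: Channel stands in for LiveData so the code runs without the Android SDK, and CartEvents, ProxyBusSketch, and the channel key format are hypothetical names chosen for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Plain-Java stand-in for LiveData<T> (assumption: no lifecycle handling here).
class Channel<T> {
    private final List<Consumer<T>> observers = new ArrayList<>();

    void observe(Consumer<T> observer) { observers.add(observer); }

    void setValue(T value) {
        for (Consumer<T> observer : observers) observer.accept(value);
    }
}

// Hypothetical message definition: one method per message,
// and the return type fixes the payload type at compile time.
interface CartEvents {
    Channel<String> itemAdded();
    Channel<Integer> cartCountChanged();
}

class ProxyBusSketch {
    // The bus itself: a map from channel key to channel.
    private static final Map<String, Channel<?>> CHANNELS = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    static <E> E of(Class<E> events) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Keep toString/hashCode/equals working on the proxy object.
            if (method.getDeclaringClass() == Object.class) {
                switch (method.getName()) {
                    case "toString": return events.getSimpleName() + "Proxy";
                    case "hashCode": return System.identityHashCode(proxy);
                    default: return proxy == args[0]; // equals
                }
            }
            // Every interface method maps to exactly one channel.
            String key = events.getName() + "#" + method.getName();
            return CHANNELS.computeIfAbsent(key, k -> new Channel<>());
        };
        return (E) Proxy.newProxyInstance(
                events.getClassLoader(), new Class<?>[] {events}, handler);
    }
}
```

With this sketch, ProxyBusSketch.of(CartEvents.class).itemAdded() always resolves to the same channel, regardless of which proxy instance the caller obtained.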

Using an interface has several advantages. First, errors can be discovered by type-checking during compile time, rather than during the actual message passing process. In addition, restricting the message definition and management through an interface avoids the ambiguities that arise when defining a message using String. Publishers and subscribers may still follow the message specifications defined using String as a post-checking mechanism.

Lifecycle awareness

LiveData is lifecycle aware because its LifecycleOwner and LifecycleBoundObserver are bound together. Refer to the UML diagram of LiveData’s association classes below for details.

Figure 3. The association classes of LiveData

The code for the detailed call and implementation of this binding mechanism can be found below. When we call the observe() method of LiveData, a lifecycle observer is added inside the LiveData object, allowing it to observe the lifecycle of a specific app component.
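The binding can be modeled in a few lines of plain Java. This is a simplified sketch rather than the AndroidX source: SimpleLifecycle and LifecycleAwareChannel are hypothetical stand-ins for LifecycleOwner and LiveData, the state names mirror Lifecycle.State, and the redelivery of the latest value when an observer becomes active again is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Same ordering as androidx.lifecycle.Lifecycle.State.
enum State { DESTROYED, INITIALIZED, CREATED, STARTED, RESUMED }

// Hypothetical stand-in for a LifecycleOwner's lifecycle.
class SimpleLifecycle {
    private State state = State.INITIALIZED;
    private final List<Consumer<State>> listeners = new ArrayList<>();

    State getState() { return state; }
    void addListener(Consumer<State> listener) { listeners.add(listener); }
    void removeListener(Consumer<State> listener) { listeners.remove(listener); }

    void moveTo(State next) {
        state = next;
        // Iterate over a copy: listeners may unregister themselves.
        for (Consumer<State> listener : new ArrayList<>(listeners)) listener.accept(next);
    }
}

// Hypothetical stand-in for LiveData<T>.
class LifecycleAwareChannel<T> {
    private final List<BoundObserver> bound = new ArrayList<>();

    // Plays the role of LifecycleBoundObserver: it ties one observer to one owner.
    private final class BoundObserver implements Consumer<State> {
        final SimpleLifecycle owner;
        final Consumer<T> observer;

        BoundObserver(SimpleLifecycle owner, Consumer<T> observer) {
            this.owner = owner;
            this.observer = observer;
        }

        boolean isActive() { return owner.getState().compareTo(State.STARTED) >= 0; }

        @Override
        public void accept(State state) {
            // When the owner is destroyed, the observer removes itself.
            if (state == State.DESTROYED) {
                bound.remove(this);
                owner.removeListener(this);
            }
        }
    }

    void observe(SimpleLifecycle owner, Consumer<T> observer) {
        if (owner.getState() == State.DESTROYED) return;
        BoundObserver wrapper = new BoundObserver(owner, observer);
        bound.add(wrapper);
        owner.addListener(wrapper); // the lifecycle observer added inside observe()
    }

    void setValue(T value) {
        for (BoundObserver wrapper : new ArrayList<>(bound)) {
            if (wrapper.isActive()) wrapper.observer.accept(value);
        }
    }

    int observerCount() { return bound.size(); }
}
```

Because the wrapper unregisters itself on destruction, the channel no longer holds a reference to the observer, which is what keeps the leak risk low.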

Non-stickiness

At this point, the ElegantLiveDataBus only supports sticky mode. An observer may receive data that has been sent before subscription, which is not what we want in some cases. By tracking the setValue(T) function of the LiveData, we can find out why this is happening.

The call chain runs from setValue(T) to dispatchingValue() to considerNotify(). setValue(T) increments the data version and stores the new value. dispatchingValue() then dispatches the changed value to the observers. Finally, considerNotify() checks, for each observer, whether the changed value should be delivered. From this call chain, it seems that LiveData notifies observers only of new updates to the observed data, ensuring that message notifications are not duplicated.

But why might an observer receive previously released messages? Each observer is wrapped in an ObserverWrapper, whose mLastVersion is initialized to -1 on creation. The initial mVersion of a LiveData is also -1. When we call the setValue(T) or postValue(T) methods of a LiveData object, its mVersion increases by 1. If a new observer’s mLastVersion is -1 and the LiveData it is observing has an mVersion greater than -1, the LiveData calls observer.mObserver.onChanged((T) mData), sending the previous message to the observer.

To avoid this issue, we manually set the version number of the newly registered observer to the same value as the version number of the LiveData.
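The effect of aligning the two version numbers can be shown with a small model. This sketch reimplements the version bookkeeping in plain Java instead of touching LiveData internals; VersionedChannel and its method names are hypothetical, and lifecycle checks are omitted to keep the focus on the versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of LiveData's version bookkeeping.
class VersionedChannel<T> {
    private static final int START_VERSION = -1;

    private int version = START_VERSION; // plays the role of mVersion
    private T data;
    private final List<Wrapper<T>> wrappers = new ArrayList<>();

    private static final class Wrapper<T> {
        final Consumer<T> observer;
        int lastVersion; // plays the role of mLastVersion

        Wrapper(Consumer<T> observer, int lastVersion) {
            this.observer = observer;
            this.lastVersion = lastVersion;
        }
    }

    // Sticky: the observer starts at -1, so any existing value is replayed.
    void observeSticky(Consumer<T> observer) { register(observer, START_VERSION); }

    // Non-sticky: the observer starts at the current version, so only later values arrive.
    void observe(Consumer<T> observer) { register(observer, version); }

    private void register(Consumer<T> observer, int lastVersion) {
        Wrapper<T> wrapper = new Wrapper<>(observer, lastVersion);
        wrappers.add(wrapper);
        considerNotify(wrapper);
    }

    void setValue(T value) {
        version++;
        data = value;
        for (Wrapper<T> wrapper : wrappers) considerNotify(wrapper);
    }

    private void considerNotify(Wrapper<T> wrapper) {
        if (wrapper.lastVersion >= version) return; // already up to date
        wrapper.lastVersion = version;
        wrapper.observer.accept(data);
    }
}
```

An observer registered through observe() after a value was published stays silent until the next setValue(T), while one registered through observeSticky() immediately receives the earlier value.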

Preventing data loss

The next issue we tackled was preventing data loss during message passing when using postValue(T). Let’s dig into the code to find the cause of data loss.

Data was lost because postValue(T) stores the incoming data in mPendingData and then posts a Runnable to the main thread. When this Runnable runs, it calls setValue(T), which in turn triggers the observer callbacks. If postValue(T) is called multiple times before the Runnable runs, only mPendingData is overwritten and no additional Runnable is posted. As a result, later values overwrite earlier ones, and events are lost.

To prevent data loss, we post a Runnable to the main thread every time postValue(T) is called.
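The difference between the two behaviors can be reproduced without Android. In the sketch below, a plain queue that is drained by hand stands in for the main thread's message queue; PostValueSketch and its method names are hypothetical, and the locking that real LiveData performs around the pending value is omitted because the sketch is single-threaded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class PostValueSketch {
    private static final Object NOT_SET = new Object();

    // Stand-in for the main thread's queue (assumption: drained manually).
    final Queue<Runnable> mainQueue = new ArrayDeque<>();
    final List<Object> delivered = new ArrayList<>();
    private Object pending = NOT_SET;

    // Models the coalescing behavior described above: one shared pending slot,
    // and a Runnable is posted only if none is already waiting.
    void postValueCoalescing(Object value) {
        boolean postTask = pending == NOT_SET;
        pending = value;
        if (!postTask) return;
        mainQueue.add(() -> {
            Object newValue = pending;
            pending = NOT_SET;
            setValue(newValue);
        });
    }

    // Models the fix: every call posts its own Runnable carrying its own value.
    void postValueLossless(Object value) {
        mainQueue.add(() -> setValue(value));
    }

    private void setValue(Object value) { delivered.add(value); }

    void drainMainQueue() {
        Runnable task;
        while ((task = mainQueue.poll()) != null) task.run();
    }
}
```

Posting three values in a row and then draining the queue delivers only the last value through postValueCoalescing, and all three, in order, through postValueLossless.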

Event hierarchical processing

A novel challenge we faced implementing ElegantLiveDataBus came from determining its scope. Imagine a scenario where multiple activities exist simultaneously. If the ElegantLiveDataBus has global scope and all the activities are observing the same message type, all the activities will respond to a new event. This is problematic because we only want the current activity to respond to this event.

There are two common solutions to this problem:

  1. Local scope: the scope of the message bus is defined as local, where the scope is limited to the activity.
  2. Global scope: the scope of the message bus is defined as global, where the scope is the application. In this case, the message is distinguished by the tag of its activity.

However, we found a more elegant solution than the ones above by separating the events and processing them hierarchically. Message monitoring is divided into the categories below and the user can determine which one to use according to product requirements.

  • observeWhenFront(@NonNull Observer<T> observer) observes the events from onResume() to onStop().
  • observe(@NonNull LifecycleOwner owner, @NonNull Observer<T> observer) observes the events from onCreate() to onDestroy().
  • observeSticky(@NonNull LifecycleOwner owner, @NonNull Observer<T> observer) observes the events before onCreate() and from onCreate() to onDestroy().
  • observeForever(@NonNull Observer<T> observer) observes the events from onCreate() to removeObserver().

Figure 4. The various observe methods follow different events, allowing customization.
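One way to picture the hierarchy is to give each observer a delivery window bounded by two lifecycle callbacks. The sketch below is an illustration under simplifying assumptions, not the ElegantLiveDataBus implementation: WindowedChannel and Callback are hypothetical names, the channel is tied to a single owner whose callbacks are fed in through onLifecycle(), and only two of the four methods are modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Lifecycle callbacks of the (single, assumed) owning activity.
enum Callback { ON_CREATE, ON_START, ON_RESUME, ON_PAUSE, ON_STOP, ON_DESTROY }

class WindowedChannel<T> {
    // An observer plus the callbacks that open and close its delivery window.
    private static final class Window<T> {
        final Callback opensOn;
        final Callback closesOn;
        final Consumer<T> observer;
        boolean open;

        Window(Callback opensOn, Callback closesOn, Consumer<T> observer) {
            this.opensOn = opensOn;
            this.closesOn = closesOn;
            this.observer = observer;
        }
    }

    private final List<Window<T>> windows = new ArrayList<>();

    // Receives events only between onResume() and onStop().
    void observeWhenFront(Consumer<T> observer) {
        windows.add(new Window<>(Callback.ON_RESUME, Callback.ON_STOP, observer));
    }

    // Receives events between onCreate() and onDestroy().
    void observe(Consumer<T> observer) {
        windows.add(new Window<>(Callback.ON_CREATE, Callback.ON_DESTROY, observer));
    }

    // Called by the owner for each lifecycle callback it goes through.
    void onLifecycle(Callback callback) {
        for (Window<T> window : windows) {
            if (callback == window.opensOn) window.open = true;
            else if (callback == window.closesOn) window.open = false;
        }
    }

    void setValue(T value) {
        for (Window<T> window : windows) {
            if (window.open) window.observer.accept(value);
        }
    }
}
```

With several activities alive at once, only the one whose window is currently open reacts to an event observed through observeWhenFront, without narrowing the scope of the bus itself.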

Usage examples

In this section, we’ll examine how to call and use the ElegantLiveDataBus.

Message definition

The basic definition of a new message is as below:
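A definition in this style might look like the following sketch. All names are hypothetical: MessageChannel is a stand-in for the LiveData-backed channel type of the bus, and SearchEvents is an invented example interface.

```java
// Stand-in for the bus's LiveData-backed channel type (an assumption, not an Android class).
interface MessageChannel<T> {
    void postValue(T value);
}

// Hypothetical message definition: each method declares one message,
// and its return type fixes the payload type for publishers and subscribers.
interface SearchEvents {
    MessageChannel<String> keywordSubmitted();
    MessageChannel<Boolean> filterPanelToggled();
}
```

Because the payload type is part of the method signature, a publisher that tries to send an Integer on keywordSubmitted() fails at compile time rather than at runtime.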

In the Coupang app, each domain consists of one activity and multiple fragments. For example, the product detail page has one activity with three fragments. Messages therefore need to be organized along two dimensions: the page that responds to the message (activity or fragment) and the business the message belongs to.

For instance, on the product detail page, a function jumps to the payment page. The response to the jump message is completed in the activity of the product detail page, but the message belongs to the checkout business. Thus, we must define this message as below:
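One possible way to encode both dimensions is sketched below: the interface is named after the page that responds, and the method name carries the business that owns the message. This naming scheme and every identifier in it are assumptions made for illustration, and MessageChannel is again a stand-in for the bus's channel type.

```java
// Stand-in for the bus's LiveData-backed channel type (an assumption, not an Android class).
interface MessageChannel<T> {
    void postValue(T value);
}

// Dimension 1 (who responds): the interface is named after the page handling the messages.
interface ProductDetailActivityEvents {
    // Dimension 2 (which business): the method name starts with the owning business,
    // here checkout; the payload is a hypothetical order identifier.
    MessageChannel<String> checkoutOpenPayment();
}
```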

Message registration

An example of code for message registration is shown below.

Message sending

An example of code for sending a message using the postValue(T) method is shown below.
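Registration and sending can be sketched together in plain Java. Everything here is hypothetical: MiniChannel stands in for the LiveData-backed channel and delivers synchronously (the real postValue(T) hands the value to the main thread), and MiniBus, CheckoutEvents, and UsageSketch are illustrative names rather than the actual ElegantLiveDataBus API.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Stand-in channel: synchronous delivery, no lifecycle handling.
class MiniChannel<T> {
    private final List<Consumer<T>> observers = new ArrayList<>();

    void observe(Consumer<T> observer) { observers.add(observer); }

    void postValue(T value) {
        for (Consumer<T> observer : observers) observer.accept(value);
    }
}

// Hypothetical message definition for the checkout business.
interface CheckoutEvents {
    MiniChannel<String> openPayment();
}

class MiniBus {
    private static final Map<String, MiniChannel<?>> CHANNELS = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    static <E> E of(Class<E> events) {
        return (E) Proxy.newProxyInstance(
                events.getClassLoader(), new Class<?>[] {events},
                (proxy, method, args) -> {
                    // Keep toString/hashCode/equals working on the proxy object.
                    if (method.getDeclaringClass() == Object.class) {
                        switch (method.getName()) {
                            case "toString": return events.getSimpleName() + "Proxy";
                            case "hashCode": return System.identityHashCode(proxy);
                            default: return proxy == args[0]; // equals
                        }
                    }
                    String key = events.getName() + "#" + method.getName();
                    return CHANNELS.computeIfAbsent(key, k -> new MiniChannel<>());
                });
    }
}

class UsageSketch {
    static List<String> run() {
        List<String> handled = new ArrayList<>();

        // Registration: the subscriber observes the message it cares about.
        MiniBus.of(CheckoutEvents.class).openPayment()
                .observe(orderId -> handled.add("payment page opened for " + orderId));

        // Sending: the publisher posts a value on the same message.
        MiniBus.of(CheckoutEvents.class).openPayment().postValue("order-42");

        return handled;
    }
}
```

Publisher and subscriber never reference each other; both depend only on the CheckoutEvents interface, which is also where an IDE's "find usages" leads when tracking a message.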

Conclusion

By managing messages through the defined interface of the ElegantLiveDataBus, we greatly improved the efficiency of debugging and enabled hierarchical processing of the message bus. We decreased the time needed for cross-activity interaction development by 50% and reduced the codebase by 60%.

At Coupang, engineers are devoted to improving the efficiency of Android programming through such innovative methods as the ElegantLiveDataBus. In the future, we have plans to examine the development of cross-process message passing using IPC technology.

If you are an Android engineer looking to tackle the large and exciting engineering challenges of supporting a giant e-commerce conglomerate like Coupang, search for opportunities here.
