By Yao-Hui Chua
We serve millions of buyers and sellers at Carousell. As a C2C marketplace, our users actively chat with one another before committing to a transaction.
We care about these conversations because they represent the lifeblood of our community. To support these interactions, we’ve gathered a team of engineers and designers to deliver a reliable messaging experience.
As part of this team, I rebuilt our chat frontends and picked up valuable lessons along the way.
Building a real-time application from scratch isn’t exactly straightforward. Without the right patterns in place, you’ll run into lots of errors.
In this article, I’ll share some practical tips on how you can 1) build a reliable chat client and 2) craft an intuitive chat UI.
Let’s walk through the architecture of a chat frontend. Chat UIs consist of two kinds of data entities:
- Channels, which represent the conversations between users
- Messages, which represent the content items inside those conversations
These channels and messages are updated via various sources, such as user-driven actions, real-time socket events, and chat APIs.
If you opt for a Redux-based architecture, the data flows in your application will look something like this:
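As a minimal sketch of such a flow (the action names and store shape here are hypothetical), the store can normalise channels and messages into keyed maps, with reducers applying updates regardless of whether they originate from user actions, socket events, or API responses:

```javascript
// Minimal sketch of a normalised chat store; action names are hypothetical.
const initialState = {
  channels: {},          // channelId -> channel metadata
  messagesByChannel: {}, // channelId -> ordered list of messages
};

function chatReducer(state = initialState, action) {
  switch (action.type) {
    case 'CHANNEL_RECEIVED': {
      const channel = action.payload;
      return {
        ...state,
        channels: { ...state.channels, [channel.id]: channel },
      };
    }
    case 'MESSAGE_RECEIVED': {
      // Fired by user actions, socket events, and API responses alike.
      const { channelId, message } = action.payload;
      const existing = state.messagesByChannel[channelId] || [];
      return {
        ...state,
        messagesByChannel: {
          ...state.messagesByChannel,
          [channelId]: [...existing, message],
        },
      };
    }
    default:
      return state;
  }
}
```

Keeping channels and messages normalised this way means every update source funnels through the same reducers, which makes the data flow easier to audit.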
We’ll explore how we can improve aspects of this model to strengthen your application’s reliability.
1. Building a Reliable Chat Client
To increase the likelihood of your application working as expected:
- Define your schemas first, then write your business logic
- Simplify your operations with suitable abstractions
- Withstand network failures with local fallbacks
1a. Define your schemas first, then write your business logic
One of your first goals will be to outline the schemas of your chat entities. Chat messages, in particular, can come in a range of shapes.
You might have to juggle between:
- Administrative messages (sent by your system)
- Image messages (sent by users)
- Plain text messages (also sent by users)
Without well-defined schemas, you’ll have trouble identifying the fields available to each of these subtypes.
Flow, a static type checker from Facebook, allows you to specify how your data entities are composed and enforce type safety in your operations.
Without these checks, you risk runtime errors when you access a field that doesn’t exist on a particular subtype (e.g. `Cannot read property 'x' of undefined`). Flow steers you away from such mistakes by deducing the referenced subtypes in each code path:
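As a plain-JavaScript sketch of the same idea (the field names here are illustrative), messages can be modelled as a tagged union discriminated by a `type` field, so each branch only touches fields that exist on that subtype:

```javascript
// Tagged-union sketch: each subtype carries its own fields, and the
// switch on `type` narrows which fields are safe to read.
function describeMessage(message) {
  switch (message.type) {
    case 'ADMIN':
      return `[system] ${message.notice}`;
    case 'IMAGE':
      return `${message.senderName} sent an image (${message.imageUrl})`;
    case 'TEXT':
      return `${message.senderName}: ${message.body}`;
    default:
      throw new Error(`Unknown message type: ${message.type}`);
  }
}
```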
If you pass an unexpected subtype into `renderUserMessage()`, Flow recognises it as a type violation and alerts you:
1b. Simplify your operations with suitable abstractions
A high degree of type safety will help you avoid mistakes, but it isn’t going to make your application bug-free.
Complex data operations, when mismanaged, lead to race conditions. Consider what happens when a user types a message. You’ll need to:
- Notify the other party that the user is currently typing
- Ping them again once the user has stopped
In concrete terms, you’ll have to simultaneously:
- Fire `TYPING_STARTED` events at a throttled rate
- Fire a `TYPING_STOPPED` event once the user hits “Send” (or once they’ve stopped for long enough)
If you don’t manage these concurrent tasks well, you might mistakenly trigger a `TYPING_STARTED` event after a `TYPING_STOPPED` one. This makes the typing indicator linger for longer than it should:
To prevent such behaviours, you’ll want to cover your code with tests. However, it’s also important that you adopt the right tools to keep your operations easy to reason about.
Redux-Saga, a Redux middleware library, fills this role suitably.
Redux-Saga’s API enables you to implement concurrency patterns with ease. To handle typing events, we could do:
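Redux-Saga’s `race` effect cancels the losing task automatically. As a library-free sketch of the same pattern (the event names and the three-second timeout are assumptions), plain timers can model the cancellation:

```javascript
// Library-free sketch of the typing-event race.
// Whichever happens first wins: another keystroke resets the stop timer,
// and hitting "Send" cancels it, mirroring race()'s cancellation semantics.
function createTypingNotifier(emit, stopAfterMs = 3000) {
  let typing = false;
  let stopTimer = null;

  const stop = () => {
    if (!typing) return;
    typing = false;
    clearTimeout(stopTimer);
    emit('TYPING_STOPPED');
  };

  return {
    // Call on every keystroke.
    keystroke() {
      if (!typing) {
        typing = true;
        emit('TYPING_STARTED'); // in practice, throttle repeated emissions
      }
      clearTimeout(stopTimer);
      stopTimer = setTimeout(stop, stopAfterMs);
    },
    // Call when the user hits "Send".
    send: stop,
  };
}
```

Because the stop timer is cleared and re-armed on every keystroke, a stale `TYPING_STARTED` can never fire after `TYPING_STOPPED`.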
We don’t have to worry about the side effects of one task overriding those of another; `race` cancels all remaining tasks the moment one of them resolves.
Redux-Saga shines in other areas too. As mentioned earlier, your chat client sends and receives real-time updates via a persistent socket connection. With the `eventChannel` API, you can propagate incoming events to your store:
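Stripped down to plain JavaScript (the socket’s `on`/`off` method names are assumptions), the bridge amounts to subscribing to the socket and dispatching each incoming event into the store:

```javascript
// Sketch: forward socket events into the store; returns an unsubscribe
// function, much like eventChannel's subscriber contract.
function bridgeSocketToStore(socket, dispatch) {
  const onMessage = (message) =>
    dispatch({ type: 'MESSAGE_RECEIVED', payload: message });
  const onTyping = (event) =>
    dispatch({ type: 'TYPING_EVENT_RECEIVED', payload: event });

  socket.on('message', onMessage);
  socket.on('typing', onTyping);

  return function unsubscribe() {
    socket.off('message', onMessage);
    socket.off('typing', onTyping);
  };
}
```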
1c. Withstand network failures with local fallbacks
No matter how well you’ve organised your data operations, your users will run into network failures. Such problems can be frustrating, especially for people who live with intermittent connectivity.
You can anticipate such failures in a variety of ways. One mechanism I recommend is to store failed messages locally and let users retry manually.
Let’s add this fallback with the tools we’ve seen so far. First, use Flow to update your type definition for `UserMessage`. You’ll want to distinguish between messages that have been delivered and those that are still in flight or have failed.
Adding a `pending` state to your `UserMessage` is beneficial, as it indicates to your users that their message is being sent. However, displaying this state immediately has a rickety effect:
You can avoid this behaviour by delaying the `pending` status. Introducing a delay isn’t entirely simple though: your request will often get a response even before the delay timer runs out!
Here’s the control flow for sending messages (and storing them if the network fails):
You can translate the above into a saga function of its own:
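As a library-free sketch of the same control flow (the 500 ms pending delay and the status names are assumptions), a promise-based version looks like this:

```javascript
// Sketch: only surface "pending" if the request outlives a short delay;
// on failure, mark the message as failed so the user can retry manually.
async function sendWithFallback(sendRequest, setStatus, pendingDelayMs = 500) {
  let settled = false;
  const pendingTimer = setTimeout(() => {
    if (!settled) setStatus('pending');
  }, pendingDelayMs);

  try {
    await sendRequest();
    setStatus('sent');
  } catch (err) {
    setStatus('failed'); // persist locally so a manual retry can pick it up
  } finally {
    settled = true;
    clearTimeout(pendingTimer); // mirrors cancelling the delay task
  }
}
```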
Once again, Redux-Saga excels at helping you manage your side effects. You can `cancel` ongoing tasks that haven’t been resolved and use `put` to propagate Redux actions to your store.
In case you’re curious, the logic for retrying messages is much simpler to write.
When testing offline flows, turn off your Wi-Fi. Network throttling via your browser’s developer tools is likely to preclude socket connections.
2. Crafting an Intuitive Chat UI
At this point, we’ve mostly discussed how you can limit errors in your business logic. Let’s shift our attention towards UI-centric challenges.
Be it WhatsApp, Messenger, or Slack, all modern chat applications have a fairly standard set of UI requirements. In this section, we’ll go over typing indicators, chat boxes, and scrolling behaviours.
2a. Typing Indicator
We’ll start with the tiniest of the lot: the typing indicator.
The main challenge here consists in making the dots bounce at the right rhythm. You’ll notice that the dots spend more time resting than moving.
Use CSS keyframes to control their rate of movement:
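A minimal sketch, assuming three `.dot` elements: confine the bounce to an early slice of each cycle so the dots rest for most of it, and stagger the delays so they take turns:

```css
/* Each dot bounces briefly, then rests for the remainder of the cycle. */
@keyframes chat-dot-bounce {
  0%, 60%, 100% { transform: translateY(0); }
  30%           { transform: translateY(-5px); }
}

.dot {
  animation: chat-dot-bounce 1.3s infinite ease-in-out;
}
.dot:nth-child(2) { animation-delay: 0.15s; }
.dot:nth-child(3) { animation-delay: 0.3s; }
```

The exact percentages and timings are illustrative; tune them until the resting phase clearly dominates the movement.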
2b. Chat Box
Situated below the typing indicator is the chat box — the input element in which users type their text messages.
You can think of chat boxes as variable-height `<textarea>` elements which:
- Expand and shrink according to the size of their text content
- Become scrollable once their text content exceeds a certain height
Here’s one way to render `<textarea>` elements with a dynamic height:
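A sketch of the resizing logic, assuming a hypothetical 120 px cap; call it on every `input` event:

```javascript
// Sketch: resize a textarea to fit its content, capped at maxHeight.
function autosize(textarea, maxHeight = 120) {
  textarea.style.height = 'auto'; // reset so scrollHeight reflects the content
  const contentHeight = textarea.scrollHeight;
  textarea.style.height = Math.min(contentHeight, maxHeight) + 'px';
  // Beyond the cap, let the chat box scroll instead of growing further.
  textarea.style.overflowY = contentHeight > maxHeight ? 'auto' : 'hidden';
}
```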
Note that when the user switches between channels, the chat box should automatically be in focus, so that users can start typing immediately:
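A minimal sketch of that behaviour, using hypothetical helper names:

```javascript
// Sketch: refocus the chat box whenever the active channel changes.
function makeChannelSwitcher(chatBox, loadChannel) {
  return function switchChannel(channelId) {
    loadChannel(channelId); // fetch or render the channel's messages
    chatBox.focus();        // let the user start typing immediately
  };
}
```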
2c. Scrolling Behaviours
Scrolling behaviours aren’t easy to handle. In fact, implementing them in a consistent manner might just be the most tedious task you’ll face.
Your first roadblock is cross-browser compatibility. When rendering messages from bottom-to-top, your immediate instinct might be to use `flex-direction: column-reverse`, since that inclines your scroll view to the bottom. Unfortunately, Firefox disables scrolling with this setting.
On the other hand, using `flex-direction: column` adds complexity. For starters, you’ll need to flush your scrollbar to the bottom. Browsers such as Safari don’t support scroll-anchoring, which makes it hard for you to maintain your scroll view when older messages are prepended.
To accommodate browser inconsistencies, you might consider using a variable layout. If a browser provides scroll-anchoring, use `flex-direction: column`. Otherwise, use `flex-direction: column-reverse` (and `.reverse()` your messages in your view logic for the same output):
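One way to sketch this selection (`CSS.supports` on `overflow-anchor` is one feature-detection heuristic; the helper names are hypothetical):

```javascript
// Sketch: choose the layout once, based on scroll-anchoring support.
// CSS.supports is a browser API; elsewhere we fall back to column-reverse.
const supportsScrollAnchoring =
  typeof CSS !== 'undefined' && CSS.supports('overflow-anchor', 'auto');

function getListLayout(messages, anchored = supportsScrollAnchoring) {
  return anchored
    ? { flexDirection: 'column', items: messages }
    : { flexDirection: 'column-reverse', items: [...messages].reverse() };
}
```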
Having a variable layout raises code maintainability costs, but it lets you take advantage of `column-reverse`’s bottom-to-top ordering on selected browsers. This approach has worked fine for us at Carousell so far, but your mileage may vary.
The second hurdle is repeated loading. Regardless of the active `flex-direction`, bringing the scrollbar to the top can sometimes trigger an endless cycle of requests:
When your scroll view is at the very top (i.e. `.scrollTop === 0`), the browser (even with scroll-anchoring enabled) has no idea that it needs to shift down to adjust for prepended messages:
Thus, you’ll need to add your own mechanism for tethering your scroll view to the original topmost message:
You can calibrate your scroll view by setting `.scrollTop` to be the distance between your anchor element and the top of the scroll container:
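A sketch of that calibration, assuming the anchor element is the message that was topmost before the prepend:

```javascript
// Sketch: after prepending older messages, re-point the scroll view at the
// element that used to be the topmost message (the "anchor").
function calibrateScroll(container, anchorEl) {
  // The anchor's offset from the top of the scroll container equals the
  // height of the freshly prepended content.
  container.scrollTop = anchorEl.offsetTop - container.offsetTop;
}
```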
The third obstacle relates to elements which trigger reflows when loaded. Your scroll-anchoring measures may not be effective if DOM updates occur after you’ve calibrated the `.scrollTop` value. This issue typically arises when you work with image content:
There are several good solutions to this problem, but the most straightforward one simply involves preloading your messages. This works even if you don’t know your image dimensions in advance:
In this context, preloading means forcing your browser to retrieve all image sources before prepending earlier messages to the DOM. Do ensure that your cache directives have been set to minimise repeated round trips.
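A sketch of such preloading (the `createImage` parameter is an assumption, included to keep the sketch testable outside a browser):

```javascript
// Sketch: resolve once every image has loaded (or failed), so messages are
// only prepended after the browser has their sources in hand.
function preloadImages(urls, createImage = () => new Image()) {
  return Promise.all(urls.map((src) => new Promise((resolve) => {
    const img = createImage();
    img.onload = () => resolve(src);
    img.onerror = () => resolve(src); // don't block prepending on a broken image
    img.src = src; // assigning src kicks off the request
  })));
}
```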
Given the breadth of what we’ve covered, it would be remiss of me to not include a working demo! Feel free to check out how variable layouts, scroll calibration, and preloading come together in the sandbox below:
We’ve explored reliability on multiple levels — from type safety, to data safety, to resilience against network failures. We’ve also examined critical aspects of chat UIs, such as typing indicators and scrolling behaviours.
At Carousell, our work on the chat experience is far from done. We’re excited to go further with drag-and-drop uploads, cached messages, and push notifications. I might write about these topics someday.
Feel free to reach out if you have any questions. Thanks for reading!
Many thanks to my colleagues from the chat reliability team (Dave Luong, Diona Lin, Harshit Shah, Hui Yi Chia, Jason Liu, Jason Xu, Josh Humber, and Rita Wang) for tirelessly applying themselves to this problem domain. Insights from other functions (e.g. backend, iOS, and Android) have frequently contributed to the front-end web experience.
I’m also grateful to the web team for their support, especially the platform engineers (Bang Hui Lim, Stacey Tay, and Trong Nhan Bui) who defined our best practices early on. Their work has made it a pleasure for me to build this feature for Carousell.