We were wrong about HTTP & WebSockets — here’s what we learned

Sharon Grossman
Feb 18 · 5 min read

At Soluto, we’ve created AnywhereExpert, which at its core is a high-scale rich messaging platform serving around 80 million customers worldwide. As the number of users grew, we noticed an increase in the number of messages that were simply not reaching their destination. To tackle this issue, we embarked on a mission.

What’s in it for me?

  • Are you planning to build a messaging system?
  • Curious to discover new of a large-scale platform?
  • Interested in the of a codebase serving of customers?

👻

The Goal — Reliability

The only symptom of a failed message delivery was the number of times the Resend Button popped up: the Yellow (!) Indicator.

An indication to the customer, that a message has failed to deliver

It became the goal & symbol of our task: to reduce the number of Yellow (!) indicators to 0.

We wanted to know when a message fails to deliver, and why.

WebSockets for the win, or are they? 🙈

Send & Receive data using a single socket

The application emits a Socket event to the server, and the server in return emits a Socket event to our Socket room. Classic.
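To make that round trip concrete, here is a minimal in-memory sketch of a single-socket flow. It is not Socket.IO's API and not the actual service: `InMemoryRooms`, `join`, and `handleClientEvent` are hypothetical names used only for illustration.

```typescript
// Hypothetical in-memory model of a socket server with rooms (not Socket.IO's API).
type RoomHandler = (eventName: string, data: string) => void;

class InMemoryRooms {
  private rooms = new Map<string, RoomHandler[]>();

  // A connected client joins a room and registers how it receives events.
  join(room: string, handler: RoomHandler): void {
    const members = this.rooms.get(room) ?? [];
    this.rooms.set(room, [...members, handler]);
  }

  // The server receives an event from one client and re-emits it to the room.
  // Returns how many members were reached; nothing is stored along the way.
  handleClientEvent(room: string, eventName: string, data: string): number {
    const members = this.rooms.get(room) ?? [];
    members.forEach((handler) => handler(eventName, data));
    return members.length;
  }
}

const rooms = new InMemoryRooms();
const received: string[] = [];
rooms.join("conversation-1", (eventName, data) => {
  received.push(`${eventName}:${data}`);
});

// One member is connected to this room, so the event reaches it.
const reached = rooms.handleClientEvent("conversation-1", "message", "hi");
// Nobody is connected to this room, so the event reaches no one.
const lost = rooms.handleClientEvent("conversation-2", "message", "anyone?");
```

In this toy model, an event aimed at an empty room leaves no trace anywhere. That is one illustration of why failures in such a flow can be hard to observe, not a diagnosis of what went wrong in AnywhereExpert.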

Back to our case: we could not find the root cause of the errors, either in our service, which emits the WebSocket events, or in the custom client that our Frontend applications use. Did we have subpar tooling for WebSockets compared to HTTP-based services? Perhaps.

But

WebSockets are a side effect.

Wow! That’s a bold statement, isn’t it? But what happens when you have multiple clients, external APIs, tons of flows & events, and bots? Do they all send messages via WebSockets?

This debate ultimately led us to use an event-driven pattern we’ve implemented in other critical services — tackle the side effects of creating a message, and ensure our platform is flexible and scalable enough to add components and flows.

The Pattern — HTTP, WebSockets & friends 👪

  • CRUD Operations are done via simple REST API, using HTTP.
  • Socket Server listens to events, and emits data to the connected sockets.
  • Use Kafka for robust communication between those services using events.
  • Scale each service according to its own traffic, which also reduces code mess and dependencies.
  • Trigger as many side effects as you like, by listening to the message event anywhere in your flow.
  • Let’s name them for simplicity: api for the REST service and live-api for the Socket Server.
The event-driven pattern for real-time flows
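The pattern above can be sketched in a few lines. This is a simplified, synchronous model under stated assumptions: an in-memory `EventBus` stands in for Kafka, a `Map` stands in for the database, and every name here (`EventBus`, `createMessage`, `emittedToRooms`, `auditLog`) is hypothetical rather than taken from the real services.

```typescript
// Hypothetical names throughout: the bus stands in for Kafka,
// and the Map stands in for the database.
type ChatEventType = "created";

interface ChatMessage {
  id: string; // conversation (room) id
  text: string;
}

interface ChatEvent {
  eventType: ChatEventType;
  id: string;
  data: ChatMessage;
}

type ChatListener = (event: ChatEvent) => void;

// Minimal publish/subscribe bus: every subscriber sees every event.
class EventBus {
  private listeners: ChatListener[] = [];

  subscribe(listener: ChatListener): void {
    this.listeners.push(listener);
  }

  publish(event: ChatEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

const bus = new EventBus();
const database = new Map<string, ChatMessage[]>();

// "api": handles the create request, persists first, then publishes the event.
function createMessage(id: string, text: string): ChatMessage {
  const message: ChatMessage = { id, text };
  const stored = database.get(id) ?? [];
  database.set(id, [...stored, message]);
  bus.publish({ eventType: "created", id, data: message });
  return message;
}

// "live-api": consumes events and records what it would emit to each socket room.
const emittedToRooms: Array<{ room: string; eventType: ChatEventType; data: ChatMessage }> = [];
bus.subscribe(({ eventType, id, data }) => {
  emittedToRooms.push({ room: id, eventType, data });
});

// Another side effect, added without touching api or live-api.
const auditLog: string[] = [];
bus.subscribe(({ eventType, id }) => {
  auditLog.push(`${eventType}:${id}`);
});

createMessage("room-1", "hello");
```

The point of the design is the last subscriber: a new side effect only needs to listen to the message event, and neither the REST service nor the socket server has to change.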

What?! Use HTTP for real-time operations?!

HTTP vs WebSockets — I wish it was a decisive battle, but this argument made it difficult for us to convince other developers of the validity of our solution. WebSockets are arguably better in real-time performance when it comes to web applications, so we’ve had our concerns about using HTTP in some parts of our architecture. Perhaps the better solution was to improve our tooling and code quality around WebSockets, and not turn away from it.

So why? Because honestly, in real life it looks like this

  • One way or another, we wanted to separate our HTTP and WebSockets traffic, in order to scale the services correctly.
  • The effort of refactoring & improving our tooling around WebSockets was far greater than implementing the new architecture.
  • By using this pattern, we created an event-driven flow of data that begins with HTTP’s reliability for CRUD operations and ends with WebSockets’ fast emission back to the client.

How about an example? ✍🏻

  • Frontend Application uses a generated HTTP client to create a message; meanwhile it handles optimistic rendering and further UI changes
await Api.create(message);
  • api receives the data, saves it in a database and produces to Kafka
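public async create(
  id: string,
  payload: Payload
): Promise<Data> {
  // Save the message in the database first.
  const data = await createAndSave(id, payload);
  // Then publish the event so that consumers such as live-api can react.
  await produceToKafka({data, eventType: EventType.Created});
  return data;
}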
  • live-api consumes an event, and emits to a socket room
switch (topic) {
  case 'event': {
    // Braces scope the const declarations to this case.
    const {eventType, id, data} = payload as EventPayload;
    emitToRoom(eventType, id, data);
    break;
  }
}
  • live-api receives an event on a socket room, and emits a response with data to that room
socket.to(id).emit(eventType, data);
  • Frontend app gets the incoming data & handles the UI changes.
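The client side of these steps could look roughly like the sketch below. It rests on assumptions that are not stated in the flow above: that the client generates its own message id (`clientId`), that the socket event is what confirms delivery, and that a `failed` status is what drives the Yellow (!) Indicator. `MessageStore` and its methods are hypothetical names.

```typescript
// Hypothetical client-side store; names and statuses are illustrative only.
type DeliveryStatus = "pending" | "delivered" | "failed";

interface UiMessage {
  clientId: string;
  text: string;
  status: DeliveryStatus;
}

class MessageStore {
  private messages = new Map<string, UiMessage>();

  // Optimistic rendering: show the message before the server confirms it.
  addPending(clientId: string, text: string): void {
    this.messages.set(clientId, { clientId, text, status: "pending" });
  }

  // Called when the socket event for this message arrives.
  markDelivered(clientId: string): void {
    const message = this.messages.get(clientId);
    if (message) message.status = "delivered";
  }

  // Called when the HTTP create request fails; only pending messages can fail.
  markFailed(clientId: string): void {
    const message = this.messages.get(clientId);
    if (message && message.status === "pending") message.status = "failed";
  }

  statusOf(clientId: string): DeliveryStatus | undefined {
    return this.messages.get(clientId)?.status;
  }
}

const store = new MessageStore();

// The user hits send: render immediately, then call the HTTP API.
store.addPending("m1", "hello");
// live-api emits the created event to the room: confirm the message.
store.markDelivered("m1");

// A second message whose HTTP request is assumed to have been rejected.
store.addPending("m2", "are you there?");
store.markFailed("m2");

// A late failure signal does not downgrade an already delivered message.
store.markFailed("m1");
```

Keeping the status transitions in one small store is a design choice of this sketch; it makes the failed state explicit, so it can be counted and logged.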

The Results 🙌🏻

We were surprised to see the number of undelivered messages (Yellow (!) Indicators) drop. The experience itself remained smooth, and customers reported a much more stable and even faster platform.

What about the failed messages now?

Tracking and fixing errors became a much smaller task, as the code was simpler, easier to monitor and log, and less coupled.

Number of messages that failed to deliver (Yellow ! Indicator)

What did we learn from this journey? 🤔

  • Keep your APIs lean. Leverage to create reactive and decoupled pieces of code that make sense.
  • are bound to be broken. Even though WebSockets excel at real-time performance, we found that using them only where needed, and replacing some of their uses with HTTP, actually helped our platform.
  • The separation of WebSockets and HTTP into different services gave us the ability to scale correctly and find errors in each moving part of the app.
  • Trying to fix a broken thing could be a far more complex mission than creating and of existing patterns that work.
  • in an agile environment, it’s important to take them on, and consider both success and failure.
  • Improving your , as an engineering team, and as a company, is a part of growth and success.

As a side note, our tech stack in this case is: Node.js, React + MobX, Socket.IO + Redis, Kafka, and MongoDB, on top of Kubernetes clusters.

Soluto by asurion

Engineering. Product. UX. Culture.
