The Challenges of Building a Real-Time Application

Authors: Jack Maginnes and William Groover

From a business standpoint, the fundamental idea of Savvy is what makes it exciting: everyone online and playing together for one hour each night. From an engineering standpoint, it’s what makes it terrifying.
Real-time applications have traditionally been technically difficult to build, particularly at scale. Popular apps such as HQ Trivia and Fun Run have dealt with their fair share of bugs and outages, and Savvy’s main features rely on much the same ideas. So when we set out to build Savvy, we wanted to learn from those who came before us and implement a tech stack we felt could succeed for any number of users. This begins at the code level and extends into our AWS architecture.

It’s no secret that when it comes to real-time communication, Socket.IO is king. Socket.IO is a JavaScript library built on WebSockets that “enables real-time, bi-directional communication between web clients and servers.” It was created specifically with Node.js in mind, and Node naturally became the platform for our server-side code. Node proves to be the right choice in more ways than just Socket.IO, however. For one, it scales exceptionally well, both horizontally and vertically (more on this when we discuss the cloud architecture). And while other languages such as Go may be faster, Node has a more active community of developers and a proven track record at large corporations such as Netflix.

Savvy is a native iOS app, meaning the front end is written entirely in Swift and Objective-C. Socket.IO provides a client library specifically for Swift, so the integration between the Node server and the app itself was fairly straightforward from a communication standpoint. With that established, the next step was implementing it to serve Savvy’s core functionality.

One of the coolest features of Socket.IO is socket rooms. When in a socket room, clients are able to emit messages to everyone else in that room. To break this idea down, let’s use the popular example of a chat room. Most chat room websites are split up by topic. If you were to join a topic, say basketball, you would enter a chat room with everyone else who is talking about basketball. This means you can emit messages to everyone in that room and vice versa. That is exactly how socket rooms work, and in fact most chat rooms are programmed using them.

In our example, it is likely that when you clicked to join the basketball chat room, your web browser emitted an event that the server then processed by running something like:

socket.join('basketball');

Now that you are in the basketball socket room, you type “MJ is the goat” and click send. Again, your web browser sends this to the server, which processes it and likely runs something like:

io.to('basketball').emit('message', 'MJ is the goat');

thus sending your message to everyone in the socket room in real time.
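Putting those two pieces together, a minimal sketch of such a chat server’s handlers might look like the following. (The event names 'joinTopic', 'chatMessage', and 'message' are illustrative placeholders, not any site’s actual protocol; `io` is assumed to be a Socket.IO server instance.)

```javascript
// Sketch of topic-based chat room handlers for a single connected client.
// `io` is the Socket.IO server; `socket` is one client's connection.
function registerChatHandlers(io, socket) {
  // Client asks to join a topic room, e.g. 'basketball'.
  socket.on('joinTopic', (topic) => {
    socket.join(topic);
  });

  // Client sends a message to a topic; relay it to everyone in that room.
  socket.on('chatMessage', (topic, text) => {
    io.to(topic).emit('message', text);
  });
}
```

With a real server this would be wired up as `io.on('connection', (socket) => registerChatHandlers(io, socket));`.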


While Savvy’s search algorithm and finite state machine are outside the scope of this blog post, the chat room example is a simplified version of our implementation. When a host searches for a game, a new socket room is created with the name “game/{gameId}”. As the search algorithm picks users to join that game, they join the game’s socket room, allowing communication between each of the players. This also lets the server update all of the clients about the state of the game, such as “WAITING_FOR_QUESTION”, so that each player’s phone can display the proper screen. All of this makes for a very lean and efficient client-server relationship.
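As a rough illustration of that flow, the game-room plumbing could be sketched as below. (The helper names and the 'gameState' event are our own placeholders for this post, not Savvy’s actual code; `io` is assumed to be a Socket.IO server instance.)

```javascript
// Build the room name for a game, e.g. gameRoom(42) -> 'game/42'.
function gameRoom(gameId) {
  return `game/${gameId}`;
}

// Called when the search algorithm picks a player for a game:
// joining the room lets them hear every broadcast for that game.
function addPlayerToGame(socket, gameId) {
  socket.join(gameRoom(gameId));
}

// Called whenever the game's state machine transitions, e.g. to
// 'WAITING_FOR_QUESTION', so every client can show the proper screen.
function broadcastGameState(io, gameId, state) {
  io.to(gameRoom(gameId)).emit('gameState', state);
}
```

Because all communication flows through the room, the server never needs to track individual client connections per game; the room membership is the source of truth.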

Writing the code that provides the real-time nature of Savvy’s experience is a nontrivial challenge by itself. Building the infrastructure to back it, especially at scale, is another one entirely. Supporting these features demands low latency and near-instantaneous scalability from the architecture. We leverage several AWS services to fulfill these requirements, including EC2, RDS, ElastiCache, Elastic Load Balancing, S3, and CloudWatch.

The combination of EC2 instances, an Elastic Load Balancer, and an Auto-Scaling Group allows our servers to launch on demand to accommodate virtually any level of traffic across the application. User traffic hits the Elastic Load Balancer, which distributes requests evenly across the target group containing our servers. These servers live in an Auto-Scaling Group, which automatically deploys additional instances as needed. AWS’s Relational Database Service allows us to scale our database instances both vertically and horizontally to deliver low-latency read and write operations, which is critical to ensure games play out as intended. CloudWatch gives our team crucial visibility into the application’s performance; we use its alarms and metrics not only to monitor resource utilization but also to make informed decisions as we move forward. ElastiCache allows the socket rooms to be synchronized across servers. A future blog post will explore that relationship further, as it is complex in its own right.
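By default, socket rooms live in the memory of a single Node process, so once the Auto-Scaling Group is running multiple servers, an adapter is needed to share room membership between them. One common pattern, shown here as a configuration sketch rather than our exact setup, is the socket.io-redis adapter pointed at an ElastiCache Redis endpoint:

```javascript
// Sketch: synchronizing socket rooms across servers via a shared Redis.
// The host below is a hypothetical ElastiCache endpoint, not a real one.
const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');

io.adapter(redisAdapter({
  host: 'my-cluster.example.cache.amazonaws.com', // placeholder endpoint
  port: 6379,
}));
```

With the adapter in place, an `io.to(room).emit(...)` on any one server reaches clients connected to every other server, which is what lets the load balancer distribute players freely.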


If you haven’t already, be sure to check out the app on the Apple App Store.

You can contact us on our website with bug reports, suggestions, or just to say hi.
