Stream Highlights — Architecture & Implementation

Antoine Solnichkin
Dec 1, 2018 · 5 min read
Deep diving into Stream Highlights general architecture

Stream Highlights is a self-operating clip generator based on real-time data from the Twitch IRC chat.

I developed this project following three key principles:

  • Robustness
  • Scalability
  • Fault Tolerance

Having a robust system while allowing for fault tolerance might seem counterintuitive. I believe that by applying a rigorous separation of concerns and avoiding single points of failure, the application can stay up even when an individual component fails.

The architectural big picture is presented below.

Stream Highlights General Architecture

This architecture is split into two parts:

  • The web application, composed of a rendering server, a web API, and a document-based database
  • The Detection System, composed of Node services, a time series database and a stream engine system

We will start with the web application: we will review its architecture and its pros and cons, and finish with its potential scalability.

The second part will follow the same plan, with a focus on the detection system.

I — Web Platform

The rendering server is responsible for receiving client requests, proxying them to the API if necessary, and building the correct assets for the client. Technology-wise, it uses Node, Express, and server-side React & Redux.

The Web API (named Stream Highlights API) is a simple Express API using Mongoose to retrieve data contained in our MongoDB database. MongoDB is our persistence layer and holds all of our data.

a — Separating the rendering server from the API

I followed common web practice in developing my application. However, it has two specificities: a standalone rendering server (using React server-side rendering), and a proxy on that server that forwards requests to the API. Let us look at each in turn.

Splitting Rendering Server & API

The first choice made to keep our application scalable was to split the rendering server from the web API, so that we can allocate resources to each of them on the fly depending on demand.

Under significant load, if we notice that the rendering server is getting slower, we can allocate more resources to it without touching an already fast web API. The same reasoning applies to the API if it turns out to be the bottleneck in our flow.

b — Proxy & Non Proxy Architecture

The second point I want to tackle is the use of a proxy to redirect users' requests to the API through the rendering server. Let us have a look at the two architectures.

Proxy vs Non Proxy Architecture

The proxy architecture is the one we are using right now. When a user makes a request to the API (on the client side), the request is actually proxied to the API via the rendering server. The client is not directly contacting the API.

For the non-proxy architecture, the rendering server contacts the API on the first request. On the client side, the application contacts the API directly.
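As a rough illustration of the proxy arrangement, here is a minimal TypeScript sketch of the routing decision the rendering server has to make. It is not the project's actual code: the `/api` prefix, the `API_ORIGIN` address, and the request shapes are all hypothetical, and a real Express setup would do this inside a middleware.

```typescript
// Hypothetical request shapes, kept minimal so the sketch has no dependencies.
interface IncomingRequest {
  method: string;
  path: string; // path plus query string, e.g. "/api/clips?limit=10"
  headers: Record<string, string>;
}

interface UpstreamRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
}

const API_ORIGIN = "http://api.internal:4000"; // assumed internal address of the web API
const API_PREFIX = "/api"; // assumed prefix marking calls meant for the API

// Returns the request to forward to the API, or null when the rendering
// server should handle the path itself (server-side rendering).
function toUpstream(req: IncomingRequest): UpstreamRequest | null {
  if (!req.path.startsWith(API_PREFIX + "/")) {
    return null;
  }
  // Strip the prefix and keep the rest of the path and the query string.
  const rest = req.path.slice(API_PREFIX.length);
  // Headers (including cookies) are copied as-is in this sketch.
  return { method: req.method, url: API_ORIGIN + rest, headers: { ...req.headers } };
}
```

With this shape, the browser only ever talks to the rendering server, and the API address never has to be exposed to the client.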

I chose the proxy architecture mainly for scalability reasons. The non-proxy architecture is completely valid, and our platform would perform the same way with it. However, I wanted to make sure that the platform could later be extended with user authentication easily.

User authentication would enable account creation, profile maintenance, favourite handling and so on. Protected resources on the API should therefore only be accessible to the user who owns them. The user would go through a complete authentication flow, most likely handled via cookies. This is where the non-proxy architecture runs into trouble.

If our user makes direct requests to the API on the client side, they will be authenticated and given a session cookie. That cookie will be sent to the API on every subsequent request. However, if the user asks for a protected route on the rendering server, the cookie will not be passed along, since it was issued for the API and not for the rendering server. Even though the user has authenticated through the API, they will be denied protected resources when the request goes through the rendering server.

This is the reason why we use the proxy architecture: from the user's point of view, authentication happens against the rendering server, while under the hood the API is doing the work.
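To make the cookie argument concrete, here is a small sketch of what the rendering server could do when it renders a protected page: copy the session cookie it received from the browser onto the call it makes to the API. Authentication is not implemented in the platform yet, so this is purely an assumption about how it might look; `HeaderMap` and `headersForApiCall` are hypothetical names.

```typescript
// Header names are assumed to be lower-cased, as Node does for incoming requests.
type HeaderMap = Record<string, string | undefined>;

// Builds the headers for an API call made by the rendering server
// on behalf of the browser during server-side rendering.
function headersForApiCall(browserHeaders: HeaderMap): Record<string, string> {
  const out: Record<string, string> = { accept: "application/json" };
  const cookie = browserHeaders["cookie"];
  if (cookie) {
    // Forward the session cookie so the API can identify the user.
    out["cookie"] = cookie;
  }
  return out;
}
```

Because the browser received its cookie from the rendering server's address, the cookie is present on both kinds of requests: page loads rendered on the server and client-side calls going through the proxy.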

II — Performance

Network requests performed

I have circled in red the time to first byte of the application, which is around 350 ms. The time to interactive is noticeably longer, because the application is re-rendered on the client side once the bundle is received. Let us have a look at the API requests on the client side.

Client Side Requests

Ranging from 25 to around 700 ms, the API is quite fast.

III — Optimizations

If we face more and more requests to our API, we could use a caching system so that not every request is actually served by the API, but rather by a caching database such as Redis. The architecture would look like this.

Caching Architecture For Node Apps

On the first request, the rendering server would proxy the request to the API, and the response would be stored in Redis for subsequent requests. The time to first byte (TTFB) as well as the time to interactive would drop heavily, as data would mainly be fetched from the caching database.

A second optimization would be to leverage browser caching and to minify the assets we deliver to the client. Such improvements are mainly made possible by Webpack and techniques such as code splitting and lazy module loading.
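The caching flow can be sketched as a cache-aside lookup. To keep the example runnable without any dependency, an in-memory `Map` with a time-to-live stands in for Redis; `TtlCache`, `cachedFetch`, and the `fetchFromApi` callback are hypothetical names, and a real setup would replace the map with calls to a Redis client.

```typescript
// In-memory stand-in for Redis: each entry expires after ttlMs milliseconds.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so that expiry can be exercised deterministically.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Cache-aside: serve from the cache when possible, otherwise call the API
// and store the response for subsequent requests.
async function cachedFetch<V>(
  cache: TtlCache<V>,
  key: string,
  fetchFromApi: (key: string) => Promise<V>
): Promise<V> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const fresh = await fetchFromApi(key);
  cache.set(key, fresh);
  return fresh;
}
```

The time-to-live is the main knob here: a short value keeps clips fresh at the cost of more API calls, while a long value does the opposite.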
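One common way to leverage browser caching, sketched here under assumptions: if Webpack is configured to emit bundles with a content hash in their filename, those files can be cached by the browser for a long time, while everything else is revalidated on each visit. The filename pattern and the `cacheControlFor` function are hypothetical and would need to match the actual build output.

```typescript
// Assumed naming scheme: bundles emitted as name.<8 or more hex chars>.js or .css
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css)$/;

// Picks a Cache-Control header value for a given asset path.
function cacheControlFor(path: string): string {
  if (HASHED_ASSET.test(path)) {
    // The content hash changes whenever the file changes, so the browser
    // can safely keep this version for a year without revalidating.
    return "public, max-age=31536000, immutable";
  }
  // Anything else (HTML, unhashed files) is revalidated before reuse.
  return "no-cache";
}
```

For example, `cacheControlFor("/static/main.3f9a1c2b.js")` yields the long-lived policy, while `cacheControlFor("/index.html")` yields `no-cache`.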

IV — Summary

Jump to the second part to have a look at the data processing architecture!

devconnected — DevOps, Sysadmins & Engineering

Tutorials & Guides for DevOps, sysadmins and software engineers.
