Stream Highlights is an automated clip generator driven by real-time data from the Twitch IRC chat.
I developed this project following three key principles:
- Fault Tolerance
Building a robust system while tolerating faults might seem counterintuitive. I believe that by applying a rigorous separation of concerns and avoiding single points of failure, the application can guarantee its uptime.
The architectural big picture is presented below.
This architecture is split into two parts:
- The web application, composed of a rendering server, a web API, and a document-based database
- The Detection System, composed of Node services, a time series database and a stream engine system
We will start with the web application: we will review its architecture and its pros and cons, finishing with its potential scalability.
The second part will follow the same plan, with a focus on the detection system.
I — Web Platform
The platform is composed of three entities: a rendering server, a web API and a document-based database.
The rendering server is responsible for receiving client requests, proxying them to the API when necessary, and building the correct assets for the client. Technology-wise, it uses Node, Express, and server-side React & Redux.
The Web API (named Stream Highlights API) is a simple Express API using mongoose to retrieve data from our MongoDB database. MongoDB is our persistence layer, responsible for aggregating all our data.
a — Separating the rendering server from the API
I followed standard web practices in developing the application; however, it has two specificities: a standalone rendering server (using React server-side rendering), and a proxy that forwards client requests through that server to the API. Let us have a look at these two choices.
The first choice was to split the rendering server from the web API, so that we can allocate resources on the fly depending on demand. It is mainly a scalability choice.
Under significant load, if we notice that the rendering server is getting slower, we can allocate more resources to it without over-provisioning an already fast web API. The same reasoning applies to the API if we find that it is the bottleneck.
b — Proxy & Non Proxy Architecture
The second point I want to tackle is the usage of a proxy to route users' requests to the API through the rendering server. Let us have a look at the two possible architectures.
The proxy architecture is the one we are using right now. When a user makes a request to the API (on the client side), the request is actually proxied to the API via the rendering server. The client is not directly contacting the API.
For the non proxy architecture, the rendering server contacts the API on the first request. On the client-side, the application directly contacts the API.
I chose the proxy architecture mainly for scalability reasons. The non-proxy architecture is completely valid, and our platform would perform the same way with it. However, I wanted to make sure that the platform could later support user authentication easily.
User authentication would enable account creation, profile maintenance, favourite handling and so on. Protected resources on the API should therefore only be accessible to their owner. The user would go through a complete authentication flow, most likely handled via cookies. This is where we have our bottleneck.
If our user makes direct requests to the API on the client side, they will be authenticated and receive a session cookie. That cookie will be sent along with every subsequent request to the API. However, if the user asks for a protected route rendered on the server, the cookie (scoped to the API's domain) will not be sent to the rendering server. This is our issue: even though the user has authenticated against the API, they will be denied protected resources whenever the request goes through the rendering server.
This is the reason we use the proxy architecture: the user appears to authenticate against the rendering server, but under the hood the request, cookies included, is forwarded to the API.
II — Performance
Let us have a look at the overall performance of the architecture. Here is a picture of all the network requests performed when accessing Stream Highlights.
I have circled in red the time to first byte of the application, which is around 350 ms. The time to interactive is noticeably longer, since the application is re-rendered on the client side once it receives the bundle. Let us have a look at the API requests on the client side.
Ranging from 25 ms to around 700 ms, the API is quite fast.
III — Optimizations
An architecture can always be improved depending on the needs and the bottlenecks it is facing. For our architecture, two points come to mind: caching usage and bundle optimization.
If we face more and more requests to our API, we could use a caching system so that responses are not always fetched from the API but can instead be served from a caching database such as Redis. The architecture would look like this.
On the first request, the rendering server would proxy the request to the API, but the response would be stored in Redis for subsequent requests. The time to first byte (TTFB) as well as the time to interactive would drop significantly, as data would mainly be fetched from the caching database.
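The caching logic described above is the classic cache-aside pattern. In this sketch a `Map` stands in for Redis purely for illustration; in production the `get`/`set` calls would go to a Redis client, with the TTL handled by Redis itself:

```javascript
// Cache-aside: return a cached value if it is still fresh, otherwise call
// the (slow) fetcher — e.g. the real API — and cache its result with a TTL.
async function cachedFetch(key, fetcher, cache, ttlMs) {
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) {
    return hit.value; // cache hit: skip the API entirely
  }
  const value = await fetcher(); // cache miss: go to the API
  cache.set(key, { value, expires: Date.now() + ttlMs });
  return value;
}
```

The rendering server's proxy handler would wrap each API call in `cachedFetch`, so repeated requests for the same resource never leave the caching layer until the TTL expires.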
A second optimization would be to leverage browser caching and to minify the assets we deliver to the client. Such improvements are mainly made possible by Webpack and techniques such as code splitting and lazy module loading.
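A minimal Webpack configuration sketch covering the two improvements mentioned — this is a generic starting point, not the project's actual config:

```javascript
// webpack.config.js — minification and code splitting, minimally configured.
module.exports = {
  mode: 'production', // enables asset minification out of the box
  optimization: {
    // Extract code shared between entry points into separate, cacheable chunks.
    splitChunks: { chunks: 'all' },
  },
};
```

Lazy module loading then comes from using dynamic `import()` in the application code, which Webpack turns into separately loaded bundles.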
IV — Summary
This first part focused mainly on the web application architecture and its two defining choices: separating the rendering server from the API, and proxying requests through the rendering server. We have benchmarked the architecture's performance and reviewed optimizations that could be applied to it.
Jump to the second part to have a look at the data processing architecture!