The Backend Tech Stack at FanAI

Karim Varela
Published in FanAI Engineering
Jun 9, 2020

Overview

FanAI is a company built around data, loads of it. In this post, we’ll discuss the infrastructure we’ve put in place to handle big data workloads and APIs in a scalable, resilient way.

[Diagram: Overview of the FanAI platform]

Cloud Platform

Early on, we chose to build on Google Cloud Platform (GCP) for the following reasons:

  1. GCP is unapologetically Kubernetes native. We knew we wanted to build on top of Kubernetes, and Google Kubernetes Engine (GKE) is the industry standard for managing Kubernetes. GKE is the only managed Kubernetes offering that provides automated master and node upgrades, and it integrates with Stackdriver (for logging and monitoring) by default.
  2. GCP is great for dealing with big data. It comes with arguably the best data warehousing solution on the market, BigQuery.
  3. GCP integrates seamlessly with Firebase / Firestore, which we prefer to use on our frontend for authentication and other user-centric functionality.
  4. GCP integrates seamlessly with G Suite, which makes it really easy to set up IAM rules and policies that fit your organizational structure and roles. Security and privacy of data are extremely important to us, and we're a small team, so this tight integration saves a lot of headaches.

Python Monorepo

Historically, we built our platform as more of a service-oriented architecture (SOA), but in the last year we have moved towards a monorepo for improved productivity and developer experience. With a small team, and no dedicated DevOps or infrastructure people yet, we realized that there was just too much setup and boilerplate involved to warrant maintaining multiple repos.

Having said that, it's important to note that our platform is not monolithic. There are clear boundaries and responsibilities for each service, and we can deploy each service independently of the others, even though they live in the same code repository. The monorepo allows us to easily share code amongst individually deployed applications. These applications, however, do not import from each other; they import only from a shared core Python module, which prevents coupling between the individual releases. This also means we could easily break them out into individual repos in the future, for example once we grow as a company and have the resources to devote a team to each service.
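To keep that discipline honest, a check in CI can fail any build where one service imports from a sibling service. Below is a minimal sketch of such a check; the services/ and core/ directory names are illustrative assumptions, not necessarily our actual layout.

```
# Minimal sketch: fail CI if any service imports from a sibling service.
# Assumed layout (illustrative): core/ is the shared package, and each
# deployable app lives under services/<name>/.
import ast
import pathlib
import sys

SERVICES_DIR = pathlib.Path("services")


def forbidden_imports(service):
    """Yield (file, module) pairs where a service imports another service."""
    for py_file in service.rglob("*.py"):
        tree = ast.parse(py_file.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                modules = [node.module]
            else:
                continue
            for module in modules:
                root = module.split(".")[0]
                # Importing the shared core package is fine; importing a
                # sibling service couples the releases together.
                if root == "services" or (
                    (SERVICES_DIR / root).is_dir() and root != service.name
                ):
                    yield py_file, module


if __name__ == "__main__":
    failures = [
        (path, module)
        for service in SERVICES_DIR.iterdir()
        if service.is_dir()
        for path, module in forbidden_imports(service)
    ]
    for path, module in failures:
        print(f"{path}: illegal cross-service import '{module}'")
    sys.exit(1 if failures else 0)
```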

We try to stay on the cutting edge of stability with Python. We’re currently running Python 3.7.

Kubernetes

As I mentioned previously, 90% of our backend runs in Kubernetes. We utilize GKE (Google Kubernetes Engine) to manage the underlying GCP Kubernetes infrastructure for us, and we define and manage groupings of Kubernetes resources using Helm.

We define a lot in GKE, including workloads for APIs, ETLs, and databases, and ingress into our services. We’ve moved away from standing up databases in Kubernetes, however, preferring to utilize managed storage services from GCP. We’ll talk more about that in the Storage section below.

APIs

We have a set of RESTful APIs that we've written in Python and stood up in Kubernetes. We use the hug API framework, which enables faster API creation through annotations, easy versioning, and painless documentation. Hug is also one of the faster API frameworks, according to Pycnic.

[Image taken from https://www.hug.rest/]
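To give a feel for the annotation and versioning style, here is a minimal sketch adapted from the hug documentation; the endpoint itself is illustrative, not one of ours.

```
# Minimal hug sketch: annotation-based routing plus side-by-side versioning.
import hug


@hug.get('/echo', versions=1)
def echo(text):
    """v1: return the text unchanged."""
    return text


@hug.get('/echo', versions=2)
def echo_v2(text):
    """v2: wrap the response, served alongside v1."""
    return {'echo': text}
```

Running `hug -f api.py` serves both GET /v1/echo?text=hi and GET /v2/echo?text=hi, and hug derives documentation from the docstrings and type annotations.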

ETLs / Data Gathering

We also have a set of ETLs written in Python, stood up in Kubernetes, and glued together via Pub/Sub. We have three core ETLs that provide data to FanAI data scientists, analysts, and the frontend 24x7:

  1. Twitch
  2. Twitter
  3. FullContact (demographic enrichment)

We typically define each stage of an ETL (extract, transform, load) as its own workload, kicked off either periodically via a cron schedule or in an event-driven fashion via its subscription to some Pub/Sub topic.
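As a minimal sketch of the event-driven variant (the project, subscription, and payload here are hypothetical), a transform workload subscribing to an upstream stage looks roughly like this:

```
# Minimal sketch: an event-driven ETL stage listening on a Pub/Sub topic.
# Project and subscription names are hypothetical.
import json

from google.cloud import pubsub_v1

PROJECT_ID = "my-gcp-project"
SUBSCRIPTION = "twitter-extract-done"  # published to by the upstream stage


def transform(message):
    """Process one unit of work emitted by the upstream (extract) stage."""
    payload = json.loads(message.data.decode("utf-8"))
    # ... transform the extracted records and hand off to the load stage ...
    message.ack()


subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION)
future = subscriber.subscribe(subscription_path, callback=transform)

try:
    future.result()  # block forever; the Kubernetes workload stays running
except KeyboardInterrupt:
    future.cancel()
```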

CI/CD

We currently maintain just two environments: staging and production. We use Google Cloud Build as our build server.

We run a number of checks and linting in a pre-commit hook including (but not limited to):

  • Linting via Flake8
  • Bandit for security issues

Then, for each pull request, we run a number of automated checks and tests including (but not limited to):

  • Unit tests
  • Functional tests

When code is merged into our develop branch, we build a Docker image, using previously built images as a cache to reduce build time, and run all of our functional, unit, and integration tests against the container. We then automatically deploy to our staging environment and kick off a set of Postman API regression tests as a final check. If the API tests fail, we roll back the release and any database migration that was performed.

Storage

We utilize a number of different storage technologies to meet our storage, security, and privacy needs as a big data analytics company.

ArangoDB

Arango is a multi-model database that we use for its graph database capabilities. We prefer a graph database for modeling the relationships between all sorts of entities in our universe. For example, a player can be on a roster, a roster can be part of a team, a team can be in a league, and so forth. We also model the relationships between brands and sub-brands and how they're categorized.
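As a minimal sketch of what a traversal looks like (the database, credentials, graph, and collection names here are illustrative), finding everything connected to a player with the python-arango driver might look like:

```
# Minimal sketch: a graph traversal with the python-arango driver.
from arango import ArangoClient

client = ArangoClient(hosts="http://localhost:8529")
db = client.db("fanai", username="api", password="secret")

# Walk 1-3 hops outward from a player: roster, team, league, and so on.
cursor = db.aql.execute(
    """
    FOR v, e IN 1..3 OUTBOUND @start GRAPH 'entities'
        RETURN {entity: v.name, via: e}
    """,
    bind_vars={"start": "players/12345"},
)
for row in cursor:
    print(row)
```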

Elasticsearch

With all the entities in our graph DB, we needed a fast way to search through them, so we index them in Elasticsearch and provide APIs for our clients to quickly find the entities they're looking for.
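A minimal sketch of such a lookup with the official Python client (the index name and field are illustrative):

```
# Minimal sketch: fuzzy full-text search over the indexed entities.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])


def search_entities(text, size=10):
    """Full-text entity search; fuzziness tolerates typos like 'clod9'."""
    return es.search(
        index="entities",
        body={"query": {"match": {"name": {"query": text, "fuzziness": "AUTO"}}}},
        size=size,
    )


hits = search_entities("clod9")["hits"]["hits"]
```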

Postgres

We utilize Postgres as our main transactional DB. Historically we stood up Postgres within Kubernetes, but decided in the last year that the operational overhead wasn't worth it with our small team, and now rely on Google's Cloud SQL managed Postgres.

Redis

We utilize Redis primarily for caching the results of web requests and long-running queries, in order to improve the performance of our system. Similar to Postgres, we used to stand up Redis inside of Kubernetes, but have migrated to Cloud Memorystore.
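The pattern is classic cache-aside. A minimal sketch with redis-py (the host and TTL are illustrative):

```
# Minimal sketch: cache-aside for slow queries, with a one-hour TTL.
import json

import redis

cache = redis.Redis(host="10.0.0.3", port=6379)


def cached_query(key, compute, ttl_seconds=3600):
    """Return a cached result, computing and storing it on a miss."""
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = compute()  # the slow web request or query
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result
```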

Cloud Storage

We utilize Cloud Storage primarily as a means to ingest data from our clients in a secure manner and to ensure that it stays segregated from other clients' data. Cloud Storage lets us create signed URLs, so only select clients are able to upload data to us, and they can upload directly to Cloud Storage rather than passing data through our API layer first.
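A minimal sketch of minting such an upload URL with the google-cloud-storage client (the bucket and object names are illustrative):

```
# Minimal sketch: a V4 signed URL letting one client PUT one object,
# directly to Cloud Storage and scoped away from other clients' data.
from datetime import timedelta

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("fanai-client-uploads")
blob = bucket.blob("client-a/2020-06-09/fan_data.csv")

url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(minutes=15),  # the URL stops working after this
    method="PUT",
    content_type="text/csv",
)
print(url)
```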

Closing

Thanks for reading. We'll surely have follow-on posts that go into more detail on all the different parts of our big data platform. If there's anything you're curious about, feel free to hit us up. We always love to talk shop.

And one more thing: if you are a strong backend developer who would enjoy working on an exciting sponsorship performance analytics platform, send us your resume!
