How we move faster by using microservices

In competitive video gaming, resilience is a defining trait of any professional player. Not every game can be won as effortlessly as SKT1 won their World Championships. In any startup environment, you fall down and get back up a lot. With those lessons learned, this is how we have been able to quickly upgrade our infrastructure and adopt new technologies by using microservices.

We are incredibly excited to share the latest iteration of chat with our customers, as communication is a critical component of running esports tournaments successfully. Tournament organizers need to be able to resolve disputes between teams, teams need to be able to communicate with their opponents to schedule their matches, and players need to be able to quickly tell their team captains when they need a sub.

When we first shipped the Battlefy Chat feature, we wanted to ensure that we could measure and monitor all the data surrounding it. This gave us a better understanding of the discourse between our users so that we could build features that foster and simplify communication within our community. Other goals included the following:

  • to analyze communications data (relationships) and see how users were communicating with each other
  • to build on top of a technology with a mature community
  • to start with user-to-user communication and then expand into match-specific conversations

Battlefy Chat 1.0: powered by Neo4j and Mongo inside our monolithic NodeJS application
Battlefy Chat v1.0 data model in Neo4j
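
The post doesn't show the actual schema, but the core idea of Chat 1.0 — modeling conversations as relationships between users in a graph — can be sketched in plain JavaScript. All names here (`ChatGraph`, `addMessage`, `contactsOf`) are illustrative, not Battlefy's real API:

```javascript
// Minimal in-memory sketch of a user-to-user chat graph, loosely
// mirroring the kind of relationship data Chat 1.0 kept in Neo4j.
class ChatGraph {
  constructor() {
    this.messages = []; // each message is a directed edge: { from, to, body, sentAt }
  }

  // Record a message as an edge between two users.
  addMessage(from, to, body) {
    this.messages.push({ from, to, body, sentAt: Date.now() });
  }

  // "Relationship" query: everyone a given user has exchanged messages with.
  contactsOf(user) {
    const contacts = new Set();
    for (const m of this.messages) {
      if (m.from === user) contacts.add(m.to);
      if (m.to === user) contacts.add(m.from);
    }
    return [...contacts];
  }
}

const graph = new ChatGraph();
graph.addMessage('captain', 'player1', 'Can you make tonight\'s match?');
graph.addMessage('player1', 'captain', 'No, you need a sub.');
console.log(graph.contactsOf('captain')); // [ 'player1' ]
```

In a graph database the `contactsOf` query becomes a traversal over message edges, which is what makes "how are users communicating with each other" cheap to ask.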

After running Battlefy Chat 1.0 in production for three months, we learned many lessons and hit several performance roadblocks.

While we were incredibly focused on measuring the feature from a product perspective, we ended up shipping something that worked well in one use case but not so well in others (match chat: ten people in one conversation, many messages, and many concurrent users on the site). On top of that, it was implemented inside our monolithic application, which made scaling a nightmare.

  • As part of the monolith, chat was difficult to scale during larger tournaments; our only option was to spin up more application servers
  • When bottlenecks occurred in either Neo4j or Mongo, entire application servers were affected, which led to outages

As we blogged previously, we use Docker as part of our infrastructure, which makes it easier for us to strangle our communications component out into a separate microservice. Chat was originally part of Anduril (our monolith), and to make sure we could scale it independently of our API servers, we strangled it out into its own service.
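
Strangling a component out typically starts at the routing layer: traffic for the extracted capability is sent to the new service while everything else still hits the monolith. A minimal sketch of that routing decision, with hypothetical service names and paths (not Battlefy's real endpoints):

```javascript
// Hedged sketch of strangler-pattern routing: chat traffic goes to the
// extracted microservice, all other paths stay on the monolith (called
// Anduril, as in the post). The upstream URLs are placeholders.
const ROUTES = [
  { prefix: '/chat', upstream: 'http://chat-service:3000' }, // extracted microservice
  { prefix: '/',     upstream: 'http://anduril:8080' },      // monolith catch-all
];

function upstreamFor(path) {
  // First matching prefix wins, so more specific routes come first.
  const route = ROUTES.find((r) => path.startsWith(r.prefix));
  return route.upstream;
}

console.log(upstreamFor('/chat/conversations')); // http://chat-service:3000
console.log(upstreamFor('/tournaments/123'));    // http://anduril:8080
```

Because the split happens at the route level, the chat service can be scaled (or fail) independently of the monolith, and more prefixes can be peeled off over time.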

The new version of Battlefy Chat has a separate deployment pipeline and a separate git repository, uses ES6 on NodeJS 4.2 (LTS), and is built with CircleCI (we usually use Jenkins) and ESLint. To keep up with the NodeJS community, we continuously evolve our tooling by strangling more of our business capabilities out into separate microservices. Each microservice is maintained by a different team, which has an incredible amount of flexibility to adopt different technologies because it isn't limited by existing technical debt or legacy code.

A huge thanks goes out to the product team for not only exemplifying engineering excellence by upgrading our infrastructure to make it easier for the platform to adopt microservices, but also for shipping a massive enhancement to a communications capability our customers deeply care about. Battlefy is extremely proud of the work done by Ronald Chen (@pyrolistical), Jared Daley (@jsdaley), Feng Wu, and Justin Wong (@justinzwong) in shipping Chat 2.0.

Jaime Bueza
Software Engineer