At Wildlife, we started making synchronous multiplayer games in late 2015 and early 2016. Back then, most of our backend systems were written in node.js for several reasons: it had good performance and scalability, especially when compared to other languages such as Java or Python; it was the programming language our software engineers were most proficient in; and, most importantly, it was the trending programming language at the time.
As a result, there were several good open-source projects that we could use to build our systems. One of these projects was Pomelo, “a fast, scalable game server framework for node.js”, as written in its readme on GitHub.
But why do we need a game server framework? In a very simplified manner, a complex game needs different types of servers.
We could call them (not-so-micro) services. Each of these services (or server types) has different responsibilities and needs to scale independently. For example, we have connectors, the servers to which the players (or their phones) are connected, which handle basic operations such as authentication and authorization. We have the game servers, where the matches (and the magic) actually happen. And we have metagame servers, where the other operations in the game happen, such as card upgrades, fetching the leaderboards, etc.
The demands for a game server are quite different from a metagame server and we need all of them to scale up and down horizontally, given that we operate with a very large demand which oscillates with special dates, feature releases and at different times of the day.
One big responsibility of the framework is to make sure we can orchestrate all these different servers in a transparent and efficient way. Servers need to know which other servers exist, independent of their type, and also need to be able to communicate with each other, often performing operations on other servers. We call these features service discovery and RPCs (remote procedure calls).
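To make the idea concrete, here is a minimal, hypothetical sketch of what a service registry with RPC routing looks like. None of these names come from Pitaya's actual API (Pitaya backs its real registry with etcd); this is just an in-memory illustration of the concept.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// registry is a toy in-memory service registry mapping server types
// (connector, game, metagame, ...) to the live instances of that type.
type registry struct {
	mu      sync.RWMutex
	servers map[string][]string // server type -> instance IDs
}

func newRegistry() *registry {
	return &registry{servers: make(map[string][]string)}
}

// Register announces a new server instance of a given type, so that
// every other server can discover it.
func (r *registry) Register(serverType, id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.servers[serverType] = append(r.servers[serverType], id)
}

// Pick returns an instance of the requested type, as an RPC router
// would just before forwarding a call to that server.
func (r *registry) Pick(serverType string) (string, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	instances := r.servers[serverType]
	if len(instances) == 0 {
		return "", errors.New("no servers of type " + serverType)
	}
	return instances[0], nil
}

func main() {
	reg := newRegistry()
	reg.Register("connector", "connector-1")
	reg.Register("game", "game-1")
	id, _ := reg.Pick("game")
	fmt.Println(id) // game-1
}
```

A real implementation also needs liveness tracking and deregistration, which is exactly where a ping-based design can fall over at scale, as we describe below.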
Another main responsibility of the framework is that the games (clients) need to connect to the servers. These connections are typically long-lived and have strict low latency requirements. It’s also very important to be able to broadcast messages to several players that are playing the same match.
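As a sketch of the broadcast side, the snippet below groups players of one match and pushes a message to all of them. In a real framework the channels would be long-lived network connections; these names are illustrative, not Pitaya's API.

```go
package main

import "fmt"

// match groups the message channels of the players in one match.
type match struct {
	players map[string]chan string
}

func newMatch() *match {
	return &match{players: make(map[string]chan string)}
}

// Join adds a player to the match and returns the channel on which
// that player will receive messages.
func (m *match) Join(id string) chan string {
	ch := make(chan string, 8) // buffered so Broadcast never blocks here
	m.players[id] = ch
	return ch
}

// Broadcast delivers one message to every player in the match.
func (m *match) Broadcast(msg string) {
	for _, ch := range m.players {
		ch <- msg
	}
}

func main() {
	m := newMatch()
	a := m.Join("alice")
	b := m.Join("bob")
	m.Broadcast("match_started")
	fmt.Println(<-a, <-b) // match_started match_started
}
```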
Pomelo really simplified both responsibilities and was a great tool for us to launch our first synchronous multiplayer games. But the framework was not kept up-to-date by its maintainers and used a very old node.js version (yes, callback hell!). As we found bugs, we sent over PRs, but since the code was unmaintained it took a very long time for these changes to be merged, if they ever were.
Most importantly, some of its bugs and design choices eventually kept it from meeting our demands at the scale we needed. With too many servers, the ping messages broadcast to indicate whether each server was alive were by themselves enough to overload the cluster. It was also very difficult to handle RPCs when the servers were in different regions.
As Pomelo reached its limits for us, we started to think about what we should build next. One of our values is: we innovate with research. It means that we work hard to understand what works and what doesn’t, before committing to doing something new. Because we used Pomelo as the framework for our game servers for several years we knew it extensively: its strengths and its flaws. This enabled us to keep the good, change the bad, and improve what could be better.
Go takes the stage
In early 2018 we had already been using Go as the main language for the backend systems for quite a while.
The move from building services in node.js to Go, in mid-2016, was mostly based on Go’s awesome concurrency mechanisms, which are key to building scalable systems for our millions of players. Also, since we run a lot of servers, low resource usage is a very relevant feature, as it reduces our infrastructure costs and complexity.
It was then a perfect choice for us to build Pitaya, Wildlife’s own scalable game server framework.
From Pomelo we kept the distributed design and protocol for client-server communication. The biggest changes we made were in the approach to service discovery and RPCs because they were the source of our main production issues in the games that used Pomelo.
In terms of additions, we wanted Pitaya to have built-in observability features, so we made it support OpenTracing-compatible frameworks such as Jaeger. It also ships with support for Prometheus and Statsd. This is one of the best things about using Go: being so immersed in its community and its amazing open-source projects not only helped us create a more efficient framework, but also pushed us to adopt other open-source projects written in Go, such as etcd and NATS.
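The instrumentation pattern behind those reporters is simple: handlers increment counters that a metrics backend then scrapes. As a stdlib-only sketch of that idea (not Pitaya's actual Prometheus/Statsd reporters), here is the same pattern using Go's expvar package, with a made-up metric name:

```go
package main

import (
	"expvar"
	"fmt"
)

// rpcTotal counts every RPC the server handles. In production, a
// reporter would expose a counter like this to Prometheus or Statsd;
// the metric name here is invented for the example.
var rpcTotal = expvar.NewInt("rpc_total")

func handleRPC() {
	rpcTotal.Add(1) // instrument first...
	// ...then do the actual handler work here.
}

func main() {
	for i := 0; i < 3; i++ {
		handleRPC()
	}
	fmt.Println(rpcTotal.Value()) // 3
}
```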
As for Pitaya’s success and adoption inside Wildlife, the framework has been used in production in our main games for a few years now and has helped us successfully launch some of the most amazing games we’ve made so far, like Tennis Clash and Zooba.
Our Game Engineers are definitely happier, too, since we also created related projects that make development with Pitaya easier. One example is pitaya-bot, a framework of bots for integration and stress tests. Our SRE and Infrastructure engineers are happier because of Go’s low memory and CPU footprint and its amazing profiling tooling, pprof.
For us, it was very important to build Pitaya as an open-source project so other people can also build amazing projects using it. It was the logical choice after having benefited so much from another amazing open-source project in the past. As a bonus, other people can help us improve the framework by contributing to it.
Hopefully, this post explained why we built Pitaya and why Go was the right choice when doing it. In the next post, we’ll explain in-depth some of Pitaya’s features and technical decisions.
If you love building challenging new projects and Go as much as we do, you could be a part of our team! Take a look at our available positions and we hope to meet you soon!