How I built an app with 500,000 users in 5 days on a $100 server

Erik Duindam
Jul 18, 2016

There seems to be a general consensus in the world of startups that you should build an MVP (minimum viable product) without caring too much about technical scalability. I’ve heard many times that the only priority is to just get the product out there. As long as your business model would work at scale, you’re good. You shouldn’t waste time and money on making a technically scalable product. All you worry about is testing your assumptions, validating the market and gaining traction. Scalability is a concern for later. Unfortunately, this somewhat blind belief has led to some terrible failures. And Pokémon GO reminded us of it.

One person who won’t make this mistake again is Jonathan Zarra, the creator of GoChat for Pokémon GO. He reached 1 million users in 5 days by making a chat app for Pokémon GO fans. Last week, as you can read in the article, he was talking to VCs to see how he could grow and monetize his app. Right after, GoChat went down. A lot of users were lost and a lot of money was spent. A real shame for a genius move.

The article states that Zarra had a hard time paying for the servers that were necessary to host 1M active users. He never expected to get this many users. He built this app as an MVP, planning to care about scalability later. He built it to fail. Zarra hired a contractor on Upwork to fix a lot of performance issues. The contractor stated that the server costs were around $4,000. Since my calendar says it’s 2016, I assume he isn’t talking about $4,000 of hardware, but $4,000 in monthly or yearly virtual server and traffic costs.

I’ve been designing and building web platforms for hundreds of millions of active users for most of my career. I can say $4,000 is a totally unnecessary amount of money for 1M users in a chat app. Even for an MVP. It means the app’s server tech was designed poorly. It’s not easy to build a cost-efficient, scalable system for millions of monthly users. But it’s also not terribly complicated to have some sort of setup that can handle at least a decent number of users on some cheap servers in the cloud. You just have to take it into account when building the MVP, by making the right choices.

GoSnaps: 500,000 users in 5 days on a $100/month server

Technical comparison of GoChat and GoSnaps

Unique to GoChat is that it had to fetch and post a lot of chat messages every second. The article about GoChat talks about 600 requests per second for the whole app. Those 600 requests are a combination of map requests and chat messages. These chat messages are small and could/should be done over a simple socket connection, but happen often and have to be distributed to other chatters. This is manageable with the right setup, but disastrous with a poor, MVP-like setup.

GoSnaps, on the other hand, has a lot of images being fetched and ‘liked’ every second. The snaps pile up on the server, since old snaps stay relevant. Old chats do not. Since the actual image files are stored in Google Cloud Storage, the number of requested image files is not a concern for me as a developer. Google Cloud handles this and I trust Google. But the requested snaps on the map are my concern. GoSnaps has image recognition software that looks for patterns on all uploads to see if an image is Pokémon-related or not. It also resizes the images and sends them to Cloud Storage. These are all heavy operations in terms of CPU and bandwidth. Way heavier than distributing some small chat messages, but less frequent.

My conclusion is that both apps are very similar in terms of scalability complexity. GoChat handles more small messages while GoSnaps handles larger images and heavier server operations. Designing an architecture for each of these apps requires a slightly different approach, but the two are similarly complex.

How I built a scalable MVP in 24h

Let’s say I considered an MVP to be solely a race against the clock to build a functional app as quickly as possible, regardless of the technical quality of the backend. Where would I have put my images? In the database: MongoDB. It would require no configuration and almost no code. Easy. MVP. How would I have queried the snaps within a certain area that got the most likes? By just running a plain MongoDB query on the entire pile of uploaded snaps. Just one database query on one database collection. MVP. All of this would have destroyed my app and the app’s future.

Look at the query I would have had to run to get these snaps: “find all snaps within location polygon [A, B, C, D], excluding snaps marked as abuse, excluding snaps that are still being processed, ordered by number of likes, ordered by valid Pokémon GO snaps first and then ordered by newest first”. This works great on a small dataset. Great, MVP. But this would have been totally disastrous under any type of serious load. Even if I had simplified the above query to only include three conditions/sorting operations, it would have been disastrous. Why? Because this is not how a database is supposed to be used. A query should ideally be served by a single index, which is very hard to achieve when a geospatial condition is combined with several other filters and sort orders. You’ll get away with it if you don’t have a lot of users, but you’ll go down once you become successful. Like GoChat.
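To make that concrete, the naive query might look roughly like the MongoDB filter and sort specification below. This is only a sketch: the helper buildNaiveSnapQuery and the field names (location, abuse, processing, likes, valid, createdAt) are hypothetical, since the actual GoSnaps schema is not shown here.

```javascript
// Sketch of the "one big query" approach. All field names are assumptions,
// not the real GoSnaps schema.
function buildNaiveSnapQuery(corners) {
  // A GeoJSON polygon ring must be closed: the last point repeats the first.
  const ring = corners.concat([corners[0]]);
  return {
    filter: {
      location: {
        $geoWithin: { $geometry: { type: 'Polygon', coordinates: [ring] } },
      },
      abuse: false,      // exclude snaps marked as abuse
      processing: false, // exclude snaps that are still being processed
    },
    // Three sort keys on top of a geospatial filter and two other conditions.
    sort: { likes: -1, valid: -1, createdAt: -1 },
  };
}

// Example corners A, B, C, D as [longitude, latitude] pairs (made-up values).
const query = buildNaiveSnapQuery([
  [4.88, 52.36],
  [4.92, 52.36],
  [4.92, 52.38],
  [4.88, 52.38],
]);
// With a Mongoose model (assumed to be called Snap) this could be run as:
//   Snap.find(query.filter).sort(query.sort).limit(50)
```

Every condition and sort key in this specification has to be evaluated against the full pile of snaps, which is what makes it expensive under load.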

What did I do instead? After applying the CPU-expensive image recognition and doing resizing, the resized images are uploaded to Google Cloud Storage. This way the server and database don’t get hit for requesting images. The database should worry about data, not images. This saves many servers by itself. On the database side, I separate the snaps into a few different collections: all snaps, most liked snaps, newest snaps, newest valid snaps and so forth. Whenever a snap gets added, liked or marked as abuse, the code checks if it (still) belongs to one of those collections and acts accordingly. This way the code can query from prepared collections instead of running complicated queries on one huge pile of mess. It’s simply separating data logically into some simple buckets. Nothing complicated. But it allows me to query solely on the geospatial coordinates with one sorting operation, instead of a complex query as described above. In simple terms: it makes it straightforward to select data.
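The bucket idea can be sketched in a few lines. The example below is not the GoSnaps code: it uses in-memory Maps as a stand-in for MongoDB collections, and the bucket names, the like threshold and the rule that abusive or unprocessed snaps stay out of the public buckets are all assumptions made for illustration.

```javascript
// In-memory stand-in for MongoDB collections: bucket name -> Map(id -> snap).
// Bucket names and the like threshold are assumptions for illustration.
const MOST_LIKED_MIN_LIKES = 10;
const BUCKETS = ['all', 'mostLiked', 'newest', 'newestValid'];

function createStore() {
  return new Map(BUCKETS.map((name) => [name, new Map()]));
}

// Decide which buckets a snap belongs to right now.
function bucketsFor(snap) {
  const names = ['all'];
  if (snap.abuse || snap.processing) return names; // kept out of public buckets
  names.push('newest');
  if (snap.valid) names.push('newestValid');
  if (snap.likes >= MOST_LIKED_MIN_LIKES) names.push('mostLiked');
  return names;
}

// Call after every add, like or abuse report to keep the buckets in sync.
function syncBuckets(store, snap) {
  const wanted = new Set(bucketsFor(snap));
  for (const [name, bucket] of store) {
    if (wanted.has(name)) bucket.set(snap.id, snap);
    else bucket.delete(snap.id);
  }
}

// Usage: a snap is added, gets liked, then gets reported.
const store = createStore();
const snap = { id: 's1', likes: 0, valid: true, abuse: false, processing: false };
syncBuckets(store, snap); // in 'all', 'newest', 'newestValid'
snap.likes = 12;
syncBuckets(store, snap); // now also in 'mostLiked'
snap.abuse = true;
syncBuckets(store, snap); // only left in 'all'
```

The work moves to write time: each add, like or report does a little bookkeeping, so a read only has to pick one bucket and filter on coordinates.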

How much extra time did I spend on all of this? Maybe 2 to 3 hours. Why did I do this in the first place? Because that’s just the way I set things up. I assume my apps will be successful. There’s no point in building an app assuming it won’t be successful. I would not be able to sleep if my app gained traction and then died due to bad tech. I bake minimum viable scalability principles into my app. It’s the difference between happiness and total panic. It’s what I think should be part of an app MVP.

Choose the right tools for your MVP

GoSnaps uses NodeJS as the backend language/platform, which is generally fast and efficient. I use Mongoose as an ODM (object-document mapper) to keep the MongoDB work straightforward for me as a programmer. I’m not a Mongoose expert by any means and I know the library by itself has a huge codebase. Therefore Mongoose was a red flag. But yeah, MVP. At one point last weekend, our server’s 4 NodeJS processes were running at 90% CPU each, which is unacceptable to me for 800–1000 concurrent users. I realized that it had to be Mongoose doing things with my fetched data. Apparently I simply had to enable Mongoose’s “lean()” function to get plain JavaScript objects instead of magical Mongoose objects. After that change, the NodeJS processes dropped to around 5–10% CPU usage, a load reduction of roughly 90%. Just the simple logic of knowing what your code actually does is very important. Imagine having a really heavy library, like Symfony with Doctrine. It would have required a couple of servers with many CPU cores to just execute the code alone, even though the database is supposed to be the bottleneck, not the code.
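As a minimal sketch of what that change looks like in code: the helper findTopSnaps, its parameters and the likes field are hypothetical, and Snap is assumed to be a Mongoose model passed in by the caller. Only the chained find(), sort(), limit(), lean() and exec() calls are actual Mongoose query methods.

```javascript
// `Snap` is assumed to be a Mongoose model; filter, limit and the `likes`
// field are illustrative and not taken from the GoSnaps code.
function findTopSnaps(Snap, filter, limit) {
  return Snap.find(filter)
    .sort({ likes: -1 })
    .limit(limit)
    .lean()  // return plain JavaScript objects instead of Mongoose documents
    .exec(); // run the query; resolves to an array of results
}
```

The trade-off is that lean results are plain objects: they have no save() method, getters or virtuals. That is usually fine for read-only responses such as a list of snaps on a map.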

Choosing a lean and fast language is important for scalability, unless you have a lot of money for servers. Choosing a language with a lot of useful available libraries is even more important, since you want to build your MVP quickly. NodeJS, Scala and Go are good choices that cover both of these requirements. They provide a lot of good tools together with good performance. A language like PHP or Java by itself is not necessarily slow, but is usually used together with large frameworks and codebases that make the application heavy. These languages are great for clean object-oriented development and well-tested code, but not for quick and cheap scalability. I don’t want to start a big programming language argument, so let me just state that this is subjective and incomplete. I personally love Erlang and would never use it for an MVP, so all your arguments are invalid.

My previous startup Cloud Games

Since cloudgames.com was still very static, I was able to migrate the MVP to NodeJS with Redis in a few days. Similar setup, different language. This led to an immediate decrease in load of about 95%. Granted, this had more to do with avoiding PHP libraries than with the actual language. But a minimalistic NodeJS setup makes more sense than a minimalistic PHP setup. Especially since MongoDB and frontend code are also 100% JavaScript, like NodeJS. PHP without its frameworks and libraries is just another language.

We needed this cheap setup, since we were a self-funded, early-stage startup. Cloud Games is now doing well and is still based on a cost-efficient NodeJS architecture. We might not have managed to be successful with a more costly tech setup, given that we’ve been through some really tough times as a startup. Designing a low-cost, scalable architecture has been essential for success.

MVP and scalability can coexist

Edit on July 21, 2016

Unboxd

User Generated Content Laid Bare

Written by Erik Duindam

I write about startup technology. Head of Engineering at Everwise. Co-founder & Board Member at Cloud Games.
