Scaling our registration system by a factor of five

Published in Freetail Hackers · 4 min read · Dec 2, 2023

In my last article, I showed how you could host a full-stack app for managing your hackathon or other event almost for free. Today I want to talk about the lessons I learned scaling our platform, Rodeo, from an attendee count of 150 to 750.

…

It turns out I didn’t have to do much at all. The biggest problem was timeouts caused by running out of database connections. In a traditional deployment model, you might upload your code to EC2, and your instance will open a single long-lived connection to your database and reuse it for every request. You won’t need additional database connections until you exhaust the compute of your EC2 instance. This is in contrast to the serverless model, where every request might open its own database connection. Creating new connections is expensive, and the database only accepts a limited number of them at once, so I started getting messages like “timed out fetching a new connection from the connection pool” in my error logs.

Luckily, this is a well-known problem with an easy solution: enable connection pooling. A connection pooler is a piece of software that sits between your database and the outside world. It internally manages a fixed pool of connections (hence the name), and when a request comes in, it assigns a connection from the pool to complete the request instead of creating a new one. When the request finishes, the connection is returned to the pool instead of being destroyed. This greatly increases the number of serverless functions that can concurrently connect to one database. Conveniently, Supabase has built-in support for PgBouncer, and ever since turning that on (just in time for them to announce its deprecation, but that’s the next tech director’s problem 😋), I haven’t gotten a single connection exhaustion error.
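To make the idea concrete, here’s a toy sketch of what a pool does, written in TypeScript. This is not how PgBouncer is implemented and it isn’t code from Rodeo; `ConnectionPool`, `withConnection`, and the `open` callback are all hypothetical names I made up for illustration. It only shows the core trick: hand out an idle connection if there is one, open a new one only while under the cap, and otherwise make the request wait.

```typescript
// Toy illustration of connection pooling. Every name here is hypothetical;
// a real pooler like PgBouncer does far more (timeouts, auth, pool modes).
class ConnectionPool<T> {
  private idle: T[] = [];                          // open connections nobody is using
  private waiters: Array<(conn: T) => void> = [];  // requests waiting for a connection
  private opened = 0;                              // connections created so far
  private readonly open: () => T;                  // stand-in for the expensive "connect" step
  private readonly maxSize: number;                // hard cap on open connections

  constructor(open: () => T, maxSize: number) {
    this.open = open;
    this.maxSize = maxSize;
  }

  // Hand out an idle connection, open a new one if under the cap, else wait.
  acquire(): Promise<T> {
    const conn = this.idle.pop();
    if (conn !== undefined) return Promise.resolve(conn);
    if (this.opened < this.maxSize) {
      this.opened++;
      return Promise.resolve(this.open());
    }
    return new Promise<T>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Return a connection: pass it to the next waiter, or park it as idle.
  release(conn: T): void {
    const next = this.waiters.shift();
    if (next) next(conn);
    else this.idle.push(conn);
  }

  get openedCount(): number {
    return this.opened;
  }

  get waitingCount(): number {
    return this.waiters.length;
  }
}

// Run one "request" with a pooled connection, always returning it afterwards.
async function withConnection<T, R>(
  pool: ConnectionPool<T>,
  work: (conn: T) => Promise<R>,
): Promise<R> {
  const conn = await pool.acquire();
  try {
    return await work(conn);
  } finally {
    pool.release(conn);
  }
}
```

With a cap of 2, ten concurrent calls to `withConnection` still only ever open two connections; the other eight simply queue until one is released.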

This post would be boring without any images, so here’s a drawing I made to illustrate connection pooling.

The second scalability challenge was implementing downloads for files users have uploaded, like their resumes. As a principle, Rodeo tries to do as much work in the backend as possible, including server-side rendering, so it provides a usable experience for users without JavaScript. However, this would not work for files because Vercel has a 4 MB size limit for all requests and responses. We had over a thousand resumes uploaded to our system that amounted to over 200 MB in total, so there was no way that was going to fit in a single response. Instead, we had to download each file one by one from the client side. Initially, I was worried that we would take down Vercel, Supabase, or S3 by making a four-digit number of requests in the span of a minute, but funnily enough, all of those held up fine. What wasn’t fine was the browser: it turns out that if there are too many pending requests, Chrome will start throwing ERR_INSUFFICIENT_RESOURCES. (Safari and Firefox worked fine.)

The solution? We ended up throttling the requests in batches of 100, so that the next 100 files would not start downloading until all of the previous 100 had finished. This slowed down the process by maybe 20% (on a pretty fast network; could vary by connection quality), but it’s the best I could do. If anyone has run into this problem and found a better fix, let me know.
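The batching described above can be sketched roughly like this. It’s a minimal TypeScript sketch rather than Rodeo’s actual code: `chunk` and `runInBatches` are hypothetical helper names, and the `task` callback stands in for whatever downloads a single file.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `task` over every item, but never with more than `batchSize`
// tasks in flight: a batch must fully finish before the next one starts.
// Note: Promise.all rejects as soon as any task in the batch fails.
async function runInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    // Start the whole batch at once, then wait for all of it.
    const batchResults = await Promise.all(batch.map((item) => task(item)));
    results.push(...batchResults);
  }
  return results;
}
```

In the browser, this could be called as `runInBatches(urls, 100, (url) => fetch(url).then((r) => r.blob()))`, where `urls` is an assumed list of file URLs. One tradeoff of this design is that each batch waits on its slowest file before the next batch starts, which is probably where some of the slowdown comes from.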

I couldn’t think of anything else to put here, so here’s what downloading 1600+ objects from S3 looks like in the network inspector.

As for everything else, it pretty much just worked. We used 7% of Vercel’s free tier, peaking at 350 MB of bandwidth, 45K requests, 35K serverless function invocations, and 0.78 GB-hours of execution time. Supabase reported we never went above 3% CPU usage. I was very happy with how effortlessly everything scaled.

So if you were expecting a grandiose technical deep dive, I’ll have to apologize: scaling Rodeo turned out to be pretty unremarkable. I can’t complain, though! I’ll end with an update on our progress on Rodeo. Since my last post, we’ve added:

Email/password authentication with Google and GitHub OAuth
Fully editable registration questions, schedule, information board, QR code scanning options, and email templates with broad Markdown support
Timezone support
Statistics for registration questions, admissions, and QR code scans
Sponsor portal that supports choosing which questions are visible to sponsor accounts
Global user list with searching and filtering by registration questions, QR code scan counts, and more
Exporting user data as CSV and user file uploads as ZIP
More granular settings to let you do admissions the way you want to (rolling, first come, first served, “early/priority” and “regular decision,” in waves, etc.)

We’re really quite proud of how far we’ve come, and I’m hoping we can launch it before I graduate this spring. Let me know if you want to play with a testing instance!


Written by Daniel Ting