Identifying bottlenecks in performance in a simple web app: Teamo

Hwee Lin Yeo, Alexandra · Published in TeamoCard · 2 min read · Sep 4, 2020

An exercise in identifying bottlenecks in Teamo’s API response time

Recipient view of a Teamo card

Not long ago, I read Joseph Gefroh’s How I Scaled a Software System’s Performance By 35,000%, which is such a gem of a read. In short, here are some pearls of wisdom he shared on improving the performance of a fundraising platform:

  • Optimization usually falls into 2 categories: activity (requests/second, CPU usage, memory usage, connection count) and performance (response time)
  • Standard ways: horizontal scaling (while ensuring thread-safe operations), identifying N+1 queries (fetch everything in the initial query instead; see the sketch after this list), refactoring inefficient code (e.g. move querying as close to the DB as possible), backgrounding (i.e. making requests asynchronous), asset minification, solving memory leaks, and co-location (e.g. putting storage close to the compute instance to reduce latency).
  • Challenging ways: caching & dealing with stale caches
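Teamo's backend code isn't shown in this post, but to make the N+1 point concrete, here is a minimal Python sketch with a hypothetical cards/messages schema: the first function fires one extra query per card, while the second fetches everything in the initial query and groups it in memory.

```python
import sqlite3

# Hypothetical schema: a card has many messages.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cards (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, card_id INTEGER, body TEXT);
""")

def messages_n_plus_one(conn):
    # N+1: one query for the cards, then one more query per card.
    cards = conn.execute("SELECT id FROM cards").fetchall()
    return {
        card_id: conn.execute(
            "SELECT body FROM messages WHERE card_id = ?", (card_id,)
        ).fetchall()
        for (card_id,) in cards
    }

def messages_single_query(conn):
    # Fetch everything in the initial query and group in memory instead.
    rows = conn.execute(
        "SELECT c.id, m.body FROM cards c "
        "LEFT JOIN messages m ON m.card_id = c.id"
    ).fetchall()
    grouped = {}
    for card_id, body in rows:
        grouped.setdefault(card_id, []).append(body)
    return grouped
```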

A delightful user experience usually involves fast response times, and at Teamo, we are constantly improving not just the UI but also our performance. Teamo is a small web app for now, but it can already benefit from some of these optimization strategies.

To understand where the bottlenecks are, we have New Relic logging and alerts in place. Since user activity on Teamo is bursty rather than consistent (we are a platform where teams create cards for ad-hoc events), the average response time can be misleading and overly optimistic. A more meaningful metric is the 95th percentile during periods of high user activity.
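The numbers below are invented, but they illustrate why we prefer the 95th percentile to the mean: a few slow requests barely register in the average, while p95 reflects what the unluckiest users actually experience.

```python
import statistics

# Invented response times (in ms) during a burst of card activity.
response_times_ms = [80, 85, 90, 95, 100, 110, 120, 450, 1800, 2400]

mean_ms = statistics.mean(response_times_ms)
p95_ms = statistics.quantiles(response_times_ms, n=100)[94]  # 95th percentile

print(f"mean: {mean_ms:.0f} ms")  # looks moderate because most requests are fast
print(f"p95:  {p95_ms:.0f} ms")   # shows the multi-second tail some users hit
```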

With alerts in place, we could attribute high response time to specific cases: (1) card/message creation with photos, and (2) payments. Though Teamo is media-heavy, caching has served us well when loading assets, so we don't see high response times on GET requests.
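Our actual asset-serving setup isn't described in detail here, but the idea looks roughly like this Flask-flavoured sketch (the route and directory names are placeholders): long-lived cache headers on media responses mean repeat GETs are served from the browser or CDN cache instead of hitting the server.

```python
from flask import Flask, send_from_directory

app = Flask(__name__)

# Placeholder route for card media; Teamo's real asset pipeline may differ.
@app.route("/media/<path:filename>")
def media(filename):
    response = send_from_directory("media", filename)
    # With fingerprinted filenames, clients can cache aggressively, so most
    # repeat GETs never reach the app server at all.
    response.cache_control.public = True
    response.cache_control.max_age = 31536000  # one year, in seconds
    return response
```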

For the first case, we have already optimized with asset minification. Joseph's article also gave us a good lead: colocation. Since we use an external object store, we could improve response time further by moving the bucket into the same region as our server. We'd be happy to share results if we embark on such a move.
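Our object-store provider isn't named in this post, but assuming an S3-compatible store accessed through boto3, the colocation change is essentially about where the bucket lives (the region and bucket name below are placeholders).

```python
import boto3

SERVER_REGION = "ap-southeast-1"  # placeholder: the region our app servers run in

# Assumption: an S3-compatible object store accessed via boto3.
s3 = boto3.client("s3", region_name=SERVER_REGION)

# Creating the bucket in the same region as the compute instance shortens
# the round-trip for photo uploads during card creation.
s3.create_bucket(
    Bucket="teamo-card-media",  # placeholder bucket name
    CreateBucketConfiguration={"LocationConstraint": SERVER_REGION},
)
```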

The second case, however, is trickier: we use the Stripe API for payments, and some latency is expected for card payments because of verification checks. The most promising optimization on our side is refactoring the code around the payment call.
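We haven't settled on the exact refactoring yet, but one direction is the "backgrounding" idea from Joseph's list: keep the Stripe call synchronous (its verification latency is unavoidable) and move everything that doesn't affect the charge off the request path. A rough sketch, assuming the official stripe Python library and a hypothetical send_receipt_email helper:

```python
import threading

import stripe  # the official Stripe Python library

stripe.api_key = "sk_test_..."  # placeholder test key

def send_receipt_email(payment_intent_id: str) -> None:
    # Hypothetical follow-up work: emails, notifications, analytics.
    ...

def charge_card(amount_cents: int, currency: str, payment_method_id: str) -> str:
    # The Stripe call stays on the request path: its latency includes the
    # card network's verification checks and can't be optimized away by us.
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,
        currency=currency,
        payment_method=payment_method_id,
        confirm=True,
    )
    # Non-critical work runs in the background so it doesn't add to the
    # response time New Relic measures for the payment endpoint.
    threading.Thread(
        target=send_receipt_email, args=(intent.id,), daemon=True
    ).start()
    return intent.status
```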

As we continue our engineering journey, we will share more insights. Do let us know if you experience any issues on Teamo! We would be more than happy to help.

Teamo’s a platform to celebrate work occasions and keep your team engaged with social and lively group cards. 🙌 Visit us here: https://teamocard.com/
