How We Deal with Friday Afternoon Bugs

Ibrahim Arief
Published in Inside Bukalapak
3 min read, Jul 19, 2018

About a year ago, we found an edge case issue in our API that caused a tiny percentage of our iOS users to experience disruptions to their checkout flow.

Since being data-driven is part of our core culture, we ran a quick impact analysis based on the tens of billions of data points that we collect daily and calculated that the issue was causing us to lose $8000 worth of transactions every single day.

Sounds scary, right? To top it off, we found the issue on a Friday afternoon, right when the team was about to log off for the weekend.

So what did we do?

If you guessed that the team spent their weekend fixing the issue, you guessed wrong! After calculating the impact, we simply called it a day and asked the team to enjoy their weekend and fix the issue on Monday instead.

The team, celebrating their weekend, probably

So the team did exactly that. They came back on Monday feeling fresh and reinvigorated, jumped on the issue, quickly found the root cause, and deployed a fix that same afternoon.

But why would we choose to ignore the impact?

Well, it’s not really ignoring the impact. After all, we can all agree that $8000/day is not an insignificant amount of money. We can easily imagine scenarios where business leads scream their lungs out and push the team to fix the issue ASAP.

As with most other things in the world, we need to take a step back and get some perspective on the scale of things. Yes, the impact is not insignificant, but when we’re handling millions of dollars worth of e-commerce transactions per day at Bukalapak, $8000 is quite small in relative terms.

Let’s take another perspective. Over here, the majority of our 650+ tech talents focus on experiments and incremental improvements to our overall product. Teams often celebrate Big Wins: improvements that our A/B test experiments have shown to deliver at least +1% growth to our global KPI. At our scale, those improvements could translate to tens of millions of dollars worth of additional annual transactions.

What this means for an issue with an $8000/day impact is: why should we disrupt the team and push them to fix a 0.1% loss through weekend overtime if it means they have less energy to finish a potential 1% growth improvement? Everyone, including our business leads, understands this very well. Triaging production issues based on objective data has become a powerful prioritization tool for us.
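To make the comparison concrete, here is a rough back-of-the-envelope sketch of that arithmetic. Only the $8000/day loss, the 0.1% share, and the +1% Big Win threshold come from the story above. The daily transaction volume is an assumption back-solved from those numbers, not an actual Bukalapak figure, and all the names are made up for illustration.

```python
# Back-of-the-envelope triage arithmetic (illustrative only).
# From the article: the $8000/day loss and the +1% "Big Win" threshold.
# ASSUMPTION: the daily volume is back-solved from the 0.1% figure
# (8000 / 0.001); it is not a published number.
DAILY_LOSS_USD = 8_000
ASSUMED_DAILY_VOLUME_USD = 8_000_000
BIG_WIN_GROWTH = 0.01  # +1% growth to the global KPI


def loss_share(daily_loss: float, daily_volume: float) -> float:
    """Return the daily loss as a fraction of daily transaction volume."""
    return daily_loss / daily_volume


def annualize(daily_amount: float, days: int = 365) -> float:
    """Scale a daily amount to a full year."""
    return daily_amount * days


# How big is the bug relative to everything flowing through the platform?
share = loss_share(DAILY_LOSS_USD, ASSUMED_DAILY_VOLUME_USD)

# Rough cost of leaving the bug alone over Saturday and Sunday.
weekend_cost = DAILY_LOSS_USD * 2

# What a single +1% Big Win would be worth under the same assumption.
big_win_daily = ASSUMED_DAILY_VOLUME_USD * BIG_WIN_GROWTH
big_win_annual = annualize(big_win_daily)

print(f"Bug loss as share of daily volume: {share:.2%}")
print(f"Cost of waiting out the weekend:   ${weekend_cost:,.0f}")
print(f"One Big Win, per day:              ${big_win_daily:,.0f}")
print(f"One Big Win, per year:             ${big_win_annual:,.0f}")
```

Under these assumed numbers, waiting out the weekend costs a one-off $16,000, while a single Big Win is worth about ten times the bug's daily loss every day, which is how +1% ends up in the tens of millions of dollars per year.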

Taking yet another, perhaps more powerful perspective (and my personal favorite), this core value means that we respect the weekend of our teams more than $8000/day worth of transactions. How’s that for a work-life balance company? :)

P.S. We’re hiring! ;)

We have built one of the best tech workplaces in Indonesia, where we combine a strong learning culture, mutual respect, and an amazing positive impact on millions of Indonesians. This year alone we’re on track to hire more than 300 new tech talents to help accelerate our growth even further. Sounds interesting? Check out our career site. We have tons of interesting roles!

P.P.S. All the pics (except the kid’s facepalm) came from our Work at Bukalapak video. Check it out if you want to see what it feels like to join our tech family!

Ibrahim Arief
Inside Bukalapak

Father, husband, servant-leader for 700+ top-class engineers, and Computer Vision Ph.D. dropout, in that particular order.