Blibli.com: Bidding System
The offering of particular prices for something, especially at an auction.
— Bidding
Intro
Hell-O readers, in this article I want to share what we did in the Blibli.com Mobile App for bidding during the last 12/12 event. Anyway, if you’re one of the winners of this event, congratulations!!
Product Manager’s Requirements vs Developers’ Thoughts
In a meeting
PM : Okay guys, here is the design. *pointing at the left image* We will have a 12-day event and the main event is on 12/12. We will have a Bid Calendar to show the highlighted product
Dev : Easy, I’ll call the API and show the data
PM : We will have 3 sessions each day and the join button will only be shown when the time comes
Dev : Easy, I’ll create a timer, do a countdown and show the button
PM : *pointing at the middle image* This is our bidding room.
- we will have 6 products
- customers can bid by tapping the button and then the price will increase
- the current winner and bidding state (going once, going twice, etc.) will be shown in real time.
- each session will last 5 minutes
Dev : Easy, I’ll call the API every 1 sec to get the data, create the timer and do the countdown
PM : *pointing at the right image* And this is the winning announcement, we will show the product that the customer wins at the end of the event
Dev : Easy, we will call the API to get the winning products when the timer is 0
“CFO” : That’s it, we have 3 weeks before the 1st day of the event
Dev : tf dude, this is PM vs Dev section and you suddenly appeared here…
200K users will fight for $1 Mini Cooper
Expectation vs Reality
Expectation
- Call API every 1s to get the real-time result
- Call API when the user taps the “Bid” button and get the result after it
- Call API when the timer stops (bidding session ends)
Reality
- Call API every 1s to get the real-time result
1 API call/sec x 200k users x 5 min = System Down
- Call API when the user taps the “Bid” button and get the result after it
(1 Bid API + 1 Result API) x 200k users x 6 products x how-fast-users-thumbs-when-see-$1 Mini Cooper = System Down
- Call Winning API when the timer stops (bidding session ends)
1 API call x 200k users at the same time = System Down
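To put a number on the first bullet, here is a quick back-of-the-envelope calculation. It only uses figures from this article (200k users, 1 call per second, 5-minute sessions); the variable names are mine and it is a rough sketch, not a measurement.

```python
# Back-of-the-envelope load for the "poll every second" design.
USERS = 200_000            # target concurrent users
POLLS_PER_SECOND = 1       # one result call per user per second
SESSION_SECONDS = 5 * 60   # each bidding session lasts 5 minutes

requests_per_second = USERS * POLLS_PER_SECOND
requests_per_session = requests_per_second * SESSION_SECONDS

print(f"{requests_per_second:,} requests/sec")       # 200,000 requests/sec
print(f"{requests_per_session:,} requests/session")  # 60,000,000 requests/session
```

And that is before counting a single bid.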
Easy? Not anymore.
“then it is backend team issue, just scale up the backend, increase the bandwidth and blablablabla” — Someone said
If that is in your mind, you are right and wrong at the same time.
Wondering why it is wrong?? Alfonsus, who worked on the backend side, explains it in a different article, but if you really want a quick explanation, you can email him directly :)
Thanks to Google
Instead of scaling up our backend, we chose to explore another option, thanks to Google, who makes our life easier.
Firebase Realtime Database (RTDB) allowed us to store data as JSON and synchronize it in real time to every connected client. Instead of making hundreds of thousands of HTTPS calls to our backend, we used Firebase RTDB to reduce the API calls.
I will not share in detail how RTDB helped us or how we used it. You can read the RTDB docs to understand more :)
RTDB claimed that it can handle 100k simultaneous connections per database. That would only be half of our target users, but it can also be scaled up to handle 200k, and most importantly, they are Google, so we believe them :)
Long story short, we used RTDB to store our data and let our app listen to it. When data changes in our database, our backend pushes the data to RTDB, and RTDB pushes it to our app. Here is the diagram…
This data flow helped us to solve problem #1 and #2, see this…
1. Call API every 1s to get the result real-time
1̶ ̶A̶P̶I̶ ̶c̶a̶l̶l̶/̶s̶e̶c̶ ̶x̶ ̶2̶0̶0̶k̶ ̶u̶s̶e̶r̶s̶ ̶x̶ ̶5̶ ̶m̶i̶n̶=̶ ̶S̶y̶s̶t̶e̶m̶ ̶D̶o̶w̶n̶
2. Call API when the user taps the “Bid” button and get the result after it
(1 Bid API + 1̶ ̶R̶e̶s̶u̶l̶t̶ ̶A̶P̶I̶) x 200k users x 6 products x how-fast-users-thumbs-when-see-$1 Mini Cooper = S̶y̶s̶t̶e̶m̶ ̶D̶o̶w̶n̶
Since running apps were synchronized in real time, we didn’t need to poll API endpoints for results anymore. Our backend team also did something to make sure 200k Bid API calls/sec would not bring the system down, but that is another story :)
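If the "listen instead of poll" idea feels abstract, here is a tiny sketch of it. To be clear, this is not the Firebase SDK: `FakeRealtimeStore` is a hypothetical in-memory stand-in I made up for illustration, and the field names (`product`, `price`, `winner`, `state`) are assumptions, not our real schema.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[Dict[str, Any]], None]


class FakeRealtimeStore:
    """Hypothetical in-memory stand-in for RTDB: one value, pushed to all listeners."""

    def __init__(self) -> None:
        self._value: Dict[str, Any] = {}
        self._listeners: List[Listener] = []

    def listen(self, listener: Listener) -> None:
        # A client subscribes once instead of calling an API every second.
        self._listeners.append(listener)

    def set(self, value: Dict[str, Any]) -> None:
        # The backend writes once; every subscribed client gets a copy.
        self._value = dict(value)
        for listener in self._listeners:
            listener(dict(self._value))


store = FakeRealtimeStore()
received_a: List[Dict[str, Any]] = []  # stands in for app instance A
received_b: List[Dict[str, Any]] = []  # stands in for app instance B
store.listen(received_a.append)
store.listen(received_b.append)

# The backend pushes the new bidding state after it has processed a bid.
store.set({"product": "mini-cooper", "price": 2, "winner": "user-42", "state": "going once"})

print(received_a == received_b)  # both clients saw the same update
```

One write from the backend, zero result calls from the apps. The Bid API call itself is still there, of course.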
Then, how about #3??
3. Call Winning API when the timer stops (bidding session ends)
1 API call x 200k user a̶t̶ ̶t̶h̶e̶ ̶s̶a̶m̶e̶ ̶t̶i̶m̶e̶ = S̶y̶s̶t̶e̶m̶ ̶D̶o̶w̶n̶
We did a trick here, by calling the API endpoint after a random delay… Damn!! hard to explain…
Example:
Bidding ends at 10:00:00 (10 o’clock). Normally we would call the API at exactly 10 o’clock. Instead of letting 200k requests hit our backend at the same time, we pick a random X (extra time in seconds) where 0 ≤ X ≤ N (N is configurable), then we add X to 10 o’clock.
If X = 3 then we call the API at 10:00:03, if X = 10 then we call the API at 10:00:10. Eventually, the API calls are spread out based on the X value.
Well, yes, it also depends on luck: we are hoping our 200k randomized delays spread out instead of clustering on the same value :)
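Since I said it is hard to explain, here is the random delay as a minimal sketch. The function name, the value N = 10 and the calendar date are assumptions for illustration only (just the 10:00:00 time comes from the example above), and this simulates 200k clients in one process rather than showing our app code.

```python
import random
from collections import Counter
from datetime import datetime, timedelta


def jittered_call_time(session_end: datetime, max_delay_s: int, rng: random.Random) -> datetime:
    """Return session_end plus a random whole-second delay X, where 0 <= X <= max_delay_s."""
    x = rng.randint(0, max_delay_s)  # randint includes both endpoints
    return session_end + timedelta(seconds=x)


# Arbitrary date; only the 10:00:00 end time matters here.
session_end = datetime(2000, 1, 1, 10, 0, 0)
N = 10              # assumed maximum delay in seconds (configurable)
clients = 200_000   # every simulated client picks its own delay

rng = random.Random(0)
calls = Counter(jittered_call_time(session_end, N, rng) for _ in range(clients))
peak = max(calls.values())

print(len(calls), "distinct seconds, busiest second has", peak, "calls")
```

With N = 10 there are 11 possible seconds, so each one receives roughly 1/11 of the clients instead of all 200k at once. The total number of calls does not change, only the peak does, so a bigger N means a flatter peak and a longer wait for the unluckiest user.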
Everything went well
One week before the event, we performed testing. With 10 to 12 team members and their own phones, we simulated a bidding event…
WOW, it met the requirements 100%. When I bid, my friend got the result in no time. At the end of the simulated bidding event, winner names popped up as expected. We were so proud of our result and we felt we were ready for the event…
But everything changed when the Fire Nation attacked
Fire Nation Attacks
A few days before the event, we realized our troops were only 12 people: we had simulated the bidding with only 12 phones. We had never tested with 200k users.
Then we did a performance test, simulating thousands of users on the bidding screen. Can you guess the result?
The backend was still up and RTDB could handle the data changes. The issue was in the mobile app.
hundreds of bid requests = hundreds of data changes = hundreds of UI updates per second.
Can you recognize the problem? Hundreds of UI changes on a mobile device in a fraction of a second FROZE the device. High-end devices could still hold on, though they stuttered. But regular devices just froze, unusable.
We always thought about how to guard our backend, but we never thought about how the mobile app would handle 200k UI updates.
Only the Avatar, master of all four elements, could stop them.
Avatar: The Last UI Bender
Our main problem, the UI got updated hundreds of times within a second. It froze the device.
Lo and behold, we still had some tricks up our sleeves :)
We performed the UI refresh at a fixed interval. Instead of updating on every new data arrival, we set up a scheduler to update the UI at an acceptable refresh rate. No matter how much data is pushed from the server, we only reflect it in the UI once per update interval.
Example: Interval = 100 ms
Within 0.5s we have 50 RTDB changes, but we will only update the UI 5 times (every 0.1s).
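Here is the same example as a minimal sketch. `ThrottledRenderer` is a hypothetical name, the `price` field is an assumption, and time is simulated with a loop instead of a real timer, so treat it as an illustration of the idea rather than our actual app code.

```python
from typing import Callable, Optional


class ThrottledRenderer:
    """Keeps only the latest pushed state and renders at most once per tick."""

    def __init__(self, render: Callable[[dict], None]) -> None:
        self._render = render
        self._latest: Optional[dict] = None
        self._dirty = False

    def on_data(self, state: dict) -> None:
        # Called for every pushed change. Cheap: no UI work happens here.
        self._latest = state
        self._dirty = True

    def tick(self) -> None:
        # Called by a scheduler once per refresh interval.
        if self._dirty and self._latest is not None:
            self._render(self._latest)
            self._dirty = False


rendered = []  # stands in for the UI; each entry is one UI update
renderer = ThrottledRenderer(rendered.append)

INTERVAL_MS = 100
# Simulate 50 changes arriving 10 ms apart (0.5 s in total),
# with the scheduler ticking every 100 ms.
for t_ms in range(10, 501, 10):
    renderer.on_data({"price": t_ms // 10})
    if t_ms % INTERVAL_MS == 0:
        renderer.tick()

print(len(rendered), rendered[-1])  # 5 {'price': 50}
```

The intermediate values are simply dropped. The customer only ever sees the latest state at each tick, which is all they need in a bidding room.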
As a backup plan, we implemented this mechanism on both the mobile app and the backend side.
Dude, then it is not REALTIME anymore!!
Well, yes, this is not realtime anymore, but at least we saved our customers’ experience :)
The END
You must realize, there were many workarounds in our development, which might not be good in the future. We are still making improvements to this day to reduce the workarounds.
If you have a great idea to improve things, I would love to discuss it with you. Please contact me via email, or if you want to know more about how we “create” things, please reach us through this career page