Behind the scenes of Singapore’s nationwide ticket operating system for the National Day Parade

Open Government Products
Aug 8, 2023

By: Foo Chi Fa, Justyn Oh, Teo Shu Li

Ticket balloting and operations can be challenging when huge demand overwhelms the entire system. With recent news of poor concert ticketing experiences affecting users’ trust, the pressure was on for our FormSG team to deliver a better ticketing experience for National Day 2023, including the National Day Parade (NDP) and GetActive! SG Heartlands Festivals. Through our technical implementation, FormSG managed to support 2 nationwide ticketing operations for over 400,000 NDP ticket ballots, and over 15,800 tickets for the GetActive! SG Heartlands Festivals held across the nation.

Challenge #1: NDP ticket balloting

Preparation for NDP ticketing started weeks in advance. The NDP committee created a form for Singaporeans to ballot for tickets, and Open Government Products’ FormSG team undertook capacity upgrades to improve the resiliency of the system. However, one crucial challenge remained: we didn’t know how many Singaporeans would be using FormSG the moment the balloting form opened.

This is a problem for several reasons. Firstly, capacity upgrades can only take us so far. It is not cost-efficient to invest in long-term infrastructure for such transient, occasional increases in load. Also, we rely on various downstream systems such as Singpass logins and SMS and email sending, which have their own capacity constraints. These constraints in particular have had adverse production impact in the past, including affecting NDP-related forms in previous years. Secondly, we had estimates from past data, but they could be incorrect. What if twice or even ten times the number of users showed up, especially for the crucial first few minutes? No matter the scenario, we wanted to be able to handle the additional load smoothly.

The team came to the conclusion that we needed an upstream waiting room system to titrate the influx of users to our system in a controlled manner. Since the ticket application was a lottery system and not a first-come-first-served system, the order in which users entered the form did not matter. The critical function of the waiting room would be to protect the application itself.

Implementing the waiting room in 3 days

The decision to build the waiting room was made a week before NDP launch, and the team set out with the goal of building a production-ready system within 3 days, plus some buffer for contingency and further iteration. Given the tight timeline and the need for the system to be 100% reliable, the team made the following technical decisions.

Goal 1: The waiting room should scale very quickly in response to load, and act as an effective shield for the main system.

We implemented the waiting room as a reverse proxy using Cloudflare Workers which gradually allowed users into the FormSG form. By using Cloudflare Workers, the waiting room was decoupled completely from FormSG infrastructure, with practically no limits on scalability (as opposed to a server-instance based solution). Meanwhile, deploying this as a reverse proxy also meant that we could guarantee full control over entry to the form, as compared to a redirect mechanism which could be bypassed.
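As a rough illustration of the gating logic such a reverse proxy needs, here is a minimal TypeScript sketch. It is not FormSG's actual code: the cookie name `waiting_room_entry_time`, the `Decision` type and both function names are hypothetical, and the sketch assumes the user's entry time is carried in a plain cookie as a millisecond timestamp. In a Worker, the `proxy` branch would forward the request to the form origin and the other branches would serve the waiting room page.

```typescript
// Hypothetical names throughout; none of these are taken from the FormSG codebase.
const ENTRY_COOKIE = "waiting_room_entry_time";

type Decision =
  | { action: "proxy" }                      // forward the request to the form origin
  | { action: "wait"; entryTimeMs: number }  // serve the waiting room page
  | { action: "assign" };                    // no usable entry time yet: issue one

// Extract a single cookie value from a Cookie request header.
function readCookie(header: string | null, cookieName: string): string | null {
  if (!header) return null;
  for (const part of header.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === cookieName) return rest.join("=");
  }
  return null;
}

// Decide what the proxy should do with a request arriving at `nowMs`.
function decide(cookieHeader: string | null, nowMs: number): Decision {
  const raw = readCookie(cookieHeader, ENTRY_COOKIE);
  // Only accept a non-empty string of digits as a timestamp.
  const entryTimeMs = raw !== null && /^\d+$/.test(raw) ? Number(raw) : NaN;
  if (!Number.isFinite(entryTimeMs)) return { action: "assign" };
  return nowMs >= entryTimeMs
    ? { action: "proxy" }
    : { action: "wait", entryTimeMs };
}
```

Because every request passes through this decision before reaching the origin, a user cannot skip the queue simply by navigating to the form URL, which is the property a redirect-based design lacks.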

Goal 2: Keep the system as simple as possible to avoid points of failure and achieve 100% reliability.

In order to ensure 100% reliability under load, we deliberately avoided the use of a database or centralised data store to coordinate entry time between the workers, as this could introduce points of failure (e.g. concurrency, latency). Instead, we relied on randomisation at the individual worker-level to achieve load spreading, without the need for central coordination.
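To make the idea concrete, the sketch below shows one way a worker could pick an entry time on its own, with no shared state: draw it uniformly at random from a fixed window. The window length and the function name are assumptions for illustration, and the exact scheme used in production may differ. If N users arrive at the same instant and the window is W seconds long, this spreads them to an expected N / W admissions per second.

```typescript
// Hypothetical parameter: how long a burst of arrivals is spread over.
const SPREAD_WINDOW_MS = 10 * 60 * 1000;

// Pick an entry time uniformly at random within the spread window.
// `random` is injectable so the behaviour can be checked deterministically.
function assignEntryTime(
  nowMs: number,
  windowMs: number = SPREAD_WINDOW_MS,
  random: () => number = Math.random,
): number {
  // random() is in [0, 1), so the offset is in [0, windowMs - 1].
  return nowMs + Math.floor(random() * windowMs);
}
```

Each worker can run this independently, so there is no shared counter to contend on and no extra network hop before the user sees the waiting room.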

(If you’re interested in the math, refer to the technical note in the Annex.)

Goal 3: Waiting room load times should be fast and snappy.

We made use of caching as much as possible. The waiting room itself was encapsulated in a static HTML+JS script. Dynamic content on the remaining wait time was rendered based on an entry time variable stored in a browser cookie. Therefore, each user was served exactly the same (cached) content. This approach allowed us to fully leverage Cloudflare’s caching capabilities, further reducing load times for our users.

Design considerations

In addition to the engineering effort, we wanted to make sure that the experience of being in the waiting room was not frustrating for users. In particular, we wanted to communicate to users that they were indeed “moving” in the queue, and that they did not need to take any further action, such as refreshing the page as their entry time approached.

For these reasons, the waiting room page displayed the time remaining until their admission in minutes, which was dynamically updated on the page. Additionally, we had an image of a series of cute cats on the waiting room page to entertain users while they were waiting!
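The client-side calculation behind such a countdown can be very small. The following is a minimal sketch, assuming the entry time read from the cookie is a millisecond timestamp; the function names and the message wording are hypothetical. Because the page only needs the cookie and the current time, the HTML itself can stay identical for every user.

```typescript
// Minutes remaining until admission, rounded up so the page never shows
// "0 minutes" while the user is still waiting.
function minutesRemaining(entryTimeMs: number, nowMs: number): number {
  const remainingMs = Math.max(0, entryTimeMs - nowMs);
  return Math.ceil(remainingMs / (60 * 1000));
}

// Text to render on the waiting room page (wording is illustrative only).
function waitMessage(entryTimeMs: number, nowMs: number): string {
  const minutes = minutesRemaining(entryTimeMs, nowMs);
  if (minutes === 0) return "You are being admitted now.";
  return `Estimated wait: ${minutes} minute${minutes === 1 ? "" : "s"}. No need to refresh.`;
}
```

In a browser, `waitMessage` could be re-run on a timer with `Date.now()` as the second argument to keep the displayed value current.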

The final design of the waiting room enabled FormSG to handle over 400,000 ticket ballots without a hitch. Of these, 70% were submitted within the first 24 hours after the form opened. To put this in perspective, FormSG handles about 400,000 form submissions in a typical week, but spread evenly over 7 days.

Challenge #2: GetActive! SG Heartlands Festivals ticketing

Several weeks later, the FormSG team was roped in again to support ticketing operations for NDP heartland activities. However, for this form, ticket applications were on a first-come-first-served basis, which meant that giving each user a uniformly random entry time would not suffice. Additionally, we had to ensure that the queueing system was resistant to tampering and manipulation. This time round, we wanted to build a fully robust and performant first-in-first-out (FIFO) waiting room system.

Implementing the FIFO waiting room system

Using Cloudflare Workers as an on-the-edge, serverless solution worked well for NDP ticket balloting. As such, we continued to explore how we could improve our existing waiting room to support this use case.

Goal 1: Implement a FIFO system. Concretely, this means that if two users load the page at times $t_1 < t_2$, user 1 should be allowed into the form page no later than user 2.

Given that our setup was edge-deployed, there was no way to implement a coordinated FIFO system without access to a central coordination mechanism. The first point we investigated was how to store the information about the current state of the queue in a central manner. We decided to use Cloudflare’s Durable Objects, which provide consistent, low-latency and persistent storage across workers.

Another feature of our original solution was that we did not fix the number of users allowed on the form page at one time (the “servers” in queueing theory jargon, not physical CPU servers). To keep the system simple, we did not want to implement a feedback mechanism that would update user clients whenever a form response was submitted and a new “server” opened up. This meant that issuing queue numbers to each client would not work. Instead, we used a binning system in which each bin represented an entryTime, with bins ordered by time, and each bin had a manually set, predetermined maximum capacity. On load, a user was placed into the lowest-numbered bin that still had space. The bin’s entryTime was saved as a cookie on the client, and all clients in the same bin were allowed to access the form at once, at their associated entryTime.

Besides preserving the desired FIFO behaviour, this design provided a few additional benefits. It allowed us to show users a concrete and accurate wait time. Implementation-wise, it also required very little data to keep track of queue state: since entryTime is monotonically increasing, we only needed to know the entryTime of the bin currently being issued and the remaining capacity of that bin. Each client only had to access the Durable Object once to retrieve its entryTime, which kept the number of requests to the Durable Object to a minimum and helped to ensure its stability.
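The binning scheme can be sketched as a small pure function over that two-field state. This is an illustrative TypeScript sketch, not the production code: the type names, the fixed spacing between bins (`intervalMs`) and the rule for restarting the bins after the queue has drained are all assumptions, and it deliberately does not use the Durable Objects API. In a real deployment this logic would need to run somewhere that serialises updates to the state, which is the role the central store plays.

```typescript
// State kept centrally: only the bin currently being filled.
interface QueueState {
  currentEntryTimeMs: number; // entryTime of the bin being issued
  remainingCapacity: number;  // free slots left in that bin
}

// Hypothetical configuration: slots per bin and spacing between bins.
interface BinConfig {
  capacity: number;
  intervalMs: number;
}

// Issue an entry time for one arriving client and return the updated state.
function issueEntryTime(
  state: QueueState,
  config: BinConfig,
  nowMs: number,
): { entryTimeMs: number; state: QueueState } {
  let { currentEntryTimeMs, remainingCapacity } = state;

  if (currentEntryTimeMs + config.intervalMs <= nowMs) {
    // Assumption: a bin that is at least one interval old means the queue
    // has drained, so open a fresh bin at the current time.
    currentEntryTimeMs = nowMs;
    remainingCapacity = config.capacity;
  } else if (remainingCapacity <= 0) {
    // Current bin is full: move on to the next bin.
    currentEntryTimeMs += config.intervalMs;
    remainingCapacity = config.capacity;
  }

  return {
    entryTimeMs: currentEntryTimeMs,
    state: { currentEntryTimeMs, remainingCapacity: remainingCapacity - 1 },
  };
}
```

With a capacity of 2 and a 10-second interval, five back-to-back arrivals starting from an empty bin at time 0 receive entry times 0, 0, 10000, 10000 and 20000. Entry times never decrease as long as `nowMs` does not, which is the FIFO property stated in Goal 1.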

Goal 2: The cookie storing the user’s entry time should be resistant to tampering and manipulation.

To solve this issue, the cookies we issued were signed JWTs with various defence mechanisms that prevent sharing and spoofing of cookies.

Signing the cookie ensured that users could not tamper with entry times. If the page detected that a cookie had been tampered with, the cookie was removed and the user was pushed to the end of the queue. This design allowed the platform to be resilient to the load on the actual form.
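For readers unfamiliar with signed tokens, here is a minimal sketch of how an HMAC-signed (HS256) JWT can carry an entry time and be checked for tampering. It is an illustration only: the claim name `entryTimeMs` and the function names are hypothetical, the signing algorithm used in production is not stated here, and the additional defences against cookie sharing and spoofing are not shown. The sketch uses Node's built-in `crypto` module so that it runs locally; code running inside a Worker might use the Web Crypto API instead.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim name; the real token's claims are not described here.
interface EntryClaims {
  entryTimeMs: number;
}

const toB64Url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

function hmacSha256(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Build a compact JWT of the form header.payload.signature.
function signEntryToken(claims: EntryClaims, secret: string): string {
  const header = toB64Url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = toB64Url(JSON.stringify(claims));
  return `${header}.${payload}.${hmacSha256(`${header}.${payload}`, secret)}`;
}

// Return the claims if the signature is valid, otherwise null.
// The HMAC is always recomputed with our own secret, whatever the header says.
function verifyEntryToken(token: string, secret: string): EntryClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];
  const expected = Buffer.from(hmacSha256(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual requires equal lengths, so check that first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }
  try {
    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    return typeof claims.entryTimeMs === "number"
      ? { entryTimeMs: claims.entryTimeMs }
      : null;
  } catch {
    return null; // payload was not valid JSON
  }
}
```

A user who edits the payload to claim an earlier entry time cannot produce a matching signature without the secret, so `verifyEntryToken` returns `null` and the caller can discard the cookie and issue a new entry time.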

During the first minute after the form opened, we received 15,860 requests. This load was effectively distributed so that the average and median waiting times were both 11 seconds, much lower than with the previous version of the waiting room. In total, almost 13,000 tickets were issued, with 99.95% of users being given accurate first-come-first-served waiting times. The remaining 0.05% of users had slight variation in their waiting time, on the order of a few seconds. This occurred for two reasons: 1) a handful of read errors between the Cloudflare Worker and the Durable Object, and 2) the closing of the waiting room at the end of its deployment, after the queue had already dissipated. Rest assured, however, that no one was adversely affected by this in securing tickets.

Conclusion

Featuring our FormSG team bonding after the nationwide ticketing operations


Through both nationwide ticketing operations, we received an average rating of 4.85 out of 5 for our form submissions. It was by no means an easy feat, and we continued to iterate with the ultimate aim of optimising citizens’ experience and contributing to public good.

The next time you ballot for NDP tickets, look out for our waiting room and possibly other new features as we continue to make a smooth ticket balloting experience for everyone.

FormSG is an open-sourced product by Open Government Products. Check out our GitHub repository here.

Annex


We are Open Government Products, an experimental division of the Government Technology Agency of Singapore. We build technology for the public good.