BREW : The Multi-Tenancy Project

Eric Lin
ShopBack Tech Blog
8 min read · Aug 6, 2024


Better Engineering Weeks (BREW)

Something’s… brewing 😉🍲

Since Q3 2022, ShopBack has dedicated two weeks each quarter to our Engineering team for Better Engineering Weeks (BREW). This initiative aims to tackle high-impact projects that cross team boundaries, driving significant progress in engineering investment areas and fostering greater collaboration across the organization.

The Whys :

  1. Enhancing Engineering Life : We strive to make our infrastructure more stable, operable, and cost-efficient;
  2. Planning for the Future : We aim to proactively plan for our technological future;
  3. Building Collaborative Teams : We believe in fostering collaboration by working together on impactful projects.

In this series, ShopBack engineers reflect on key BREW projects that have since become core initiatives, transforming how we work and how the Group operates. And it all started… with a little brew ☕ #OwntheProblem

More about me

Hi, I’m Eric. I started using ShopBack in 2018 and quickly became a loyal customer, always taking advantage of Cashback opportunities whenever I needed to purchase something online. Given ShopBack’s extensive reach and large user base across many markets, I was keen to join the company and was fortunate enough to become a ShopBacker in February 2020.

At ShopBack, I work as a Staff Engineer, mainly focusing on member service related features. Our team handles most functions directly related to users. Over my four years with ShopBack, I have witnessed the company’s rapid growth and the swift pace of technological advancements, which have consistently enhanced my efficiency and our service stability.

During the Multi-Tenancy project, I served as a team lead. My primary responsibilities included understanding the company’s technical strategy and implementation details. Before applying it to our team’s service, I conducted Proof of Concept (PoC) tests to verify feasibility. I assigned tasks to team members, ensuring the project stayed on track and aligned with the company’s expectations.

Member service team

Why Multi-Tenancy?

Before this project, ShopBack was operating in 11 different markets, each with its own independent infrastructure to ensure that issues in one market would not affect the operations in other markets. This set-up, known as Single-Tenancy, became increasingly problematic as the number of markets grew. Launching and expanding into new markets required setting up entirely new infrastructures, demanding significant engineering effort and increasing costs.

To address these challenges, we initiated a project to allow markets within the same region to share infrastructure resources. This shared infrastructure approach, or Multi-Tenancy, aimed to streamline resource allocation and reduce redundancy.

Additionally, during this project, we transitioned all services to AWS Aurora PostgreSQL. Previously, different services used different databases, such as MongoDB and MySQL. Standardizing on a single database system reduced the complexity and cost of maintaining several. It also let us build deeper expertise in one database technology, which improves service reliability, and take advantage of AWS Aurora’s horizontal scaling capabilities, which significantly reduce the manual effort required to adjust database resources before high-traffic event days during the shopping season.

I will share some key highlights from the Multi-Tenancy project where the Engineering team was pushed beyond their boundaries in the pursuit of transformative change.

Rewriting ~500 MongoDB and MySQL queries within 1 month

For the member service I was responsible for, the biggest challenge was rewriting nearly 500 MongoDB and MySQL queries. The member service had been in development for more than five years and used our own customized SQL builder together with a plain MySQL client library (mysql2) to run SQL commands directly. This gave us maximum control over the SQL for our specific use cases.

However, one of the project goals was to transition to AWS Aurora PostgreSQL. This meant replacing the MySQL client library and reworking our program architecture, which was tightly coupled to MySQL. In effect, it required a complete rewrite of the database access logic. We decided to introduce TypeORM for its query builder and connection pool features. Most of our company’s services already used this ORM library, so we could also share knowledge and experience.

To rewrite all MySQL and MongoDB queries within a short timeframe, it seemed intuitive to have all team members work in parallel from scratch. However, this approach risked redundant logic being written by different team members. Instead, we had two team members first abstract the common logic for MySQL and MongoDB queries, creating our own DB access framework based on TypeORM. This groundwork allowed the entire team to rewrite all queries simultaneously and efficiently. As a result, we completed the DB query rewrite task within 1 month.
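To make "abstract the common logic first" concrete, here is a minimal sketch of the kind of shared helper such groundwork might contain. It is not our actual framework and it does not use TypeORM, so that it stays self-contained. The function names, the `members` table, and its columns are all hypothetical. It shows one real difference every rewritten query has to deal with: mysql2 uses `?` placeholders, while PostgreSQL uses numbered `$1, $2, …` placeholders and double-quoted identifiers.

```typescript
// Hypothetical sketch: none of these names come from ShopBack's codebase.
type Value = string | number | boolean | null;

interface Query {
  text: string;    // SQL with PostgreSQL-style $1, $2, ... placeholders
  values: Value[]; // parameter values, in placeholder order
}

// Quote an identifier the PostgreSQL way (double quotes, embedded quotes doubled).
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Build a parameterized SELECT from a plain filter object.
// Null filters become IS NULL, because "= NULL" never matches in SQL.
function buildSelect(
  table: string,
  where: Record<string, Value>,
  limit?: number
): Query {
  const values: Value[] = [];
  const conditions: string[] = [];
  for (const [column, value] of Object.entries(where)) {
    if (value === null) {
      conditions.push(`${quoteIdent(column)} IS NULL`);
    } else {
      values.push(value);
      conditions.push(`${quoteIdent(column)} = $${values.length}`);
    }
  }
  let text = `SELECT * FROM ${quoteIdent(table)}`;
  if (conditions.length > 0) {
    text += " WHERE " + conditions.join(" AND ");
  }
  if (limit !== undefined) {
    values.push(limit);
    text += ` LIMIT $${values.length}`;
  }
  return { text, values };
}

// Example: look up one active member in a given market (illustrative schema).
const example = buildSelect("members", { market: "SG", deleted_at: null, id: 42 }, 1);
console.log(example.text);
console.log(example.values);
```

With a helper like this written once, each engineer rewriting queries only supplies a table and a filter, instead of re-implementing placeholder numbering and null handling in every query.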

Meeting Unexpected BAU Traffic Load Testing Targets within 3 days

The biggest surprise in this journey was our load testing for Business As Usual (BAU) traffic failing to meet the targets, even causing the entire testing environment to go down. This was primarily because at least 50% of ShopBack’s traffic needed to pass through the member service, handling tasks such as user authentication and authorization, and providing user profiles to both internal services and ShopBack platforms (apps / web / extension). When the member service couldn’t handle the traffic, most functionalities would be unavailable.

We anticipated that the combined traffic from multiple markets would be at least five times that of a single market. Since the infrastructure in Singapore was handling traffic from nine Asian markets, we expected to manage BAU traffic by scaling up infrastructure resources. However, the issue wasn’t a lack of service resources, but rather the database’s inability to handle the high volume of queries. Simply increasing database resources would have significantly raised costs, which ran counter to the project’s original goal.

During load testing, member service consumed ~5k DB connections

Within three days, we identified the performance bottlenecks, rewrote the database queries, and implemented caching to significantly reduce database load. These adjustments allowed us to successfully meet the BAU traffic load testing targets without disproportionately increasing costs.
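As an illustration of how caching takes load off a database, here is a minimal read-through cache sketch. It is not the code we shipped. The class name, the loader signature, and the 30-second TTL are assumptions made for the example. Fresh entries are served from memory, and concurrent misses for the same key share a single load, so a burst of identical profile lookups results in one query rather than many.

```typescript
// Hypothetical sketch; names and the 30-second TTL are illustrative assumptions.
type Loader<V> = (key: string) => Promise<V>;

class ReadThroughCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private inFlight = new Map<string, Promise<V>>();
  private loader: Loader<V>;
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(loader: Loader<V>, ttlMs: number, now: () => number = Date.now) {
    this.loader = loader;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(key: string): Promise<V> {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value; // fresh entry: no database round trip
    }
    // Concurrent misses for the same key share one load.
    const pending = this.inFlight.get(key);
    if (pending !== undefined) {
      return pending;
    }
    const load = this.loader(key)
      .then((value) => {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      })
      .finally(() => {
        // Clear the marker whether the load succeeded or failed.
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, load);
    return load;
  }
}

// Example: three simultaneous lookups for the same member id.
let dbCalls = 0;
const profiles = new ReadThroughCache<string>(async (id) => {
  dbCalls += 1; // stands in for a real database query
  return "profile-" + id;
}, 30000);

Promise.all([profiles.get("42"), profiles.get("42"), profiles.get("42")]).then(
  (results) => console.log(results, "database calls:", dbCalls)
);
```

The trade-off is staleness: a cached value can be up to one TTL out of date, so a cache like this suits data that is read far more often than it changes.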

Staying the Course, with Gusto

Multi-Tenancy was not just an “everyday” project. It was daunting, but our Senior Staff Engineer Matthew, CTO San, VP Stan, and other senior technical leaders clearly communicated each technical decision and consistently reminded us of the benefits the project would bring upon completion. This clear communication kept the entire Engineering team aligned and motivated.

Our approach involved having multiple Staff Engineers from each technical group participate in discussions about major technical decisions. Before announcing these decisions to all Engineering teams, we ensured that the Staff Engineers were aligned. This was followed by sharing guidelines and collecting feedback and concerns through various asynchronous communication tools like Confluence, Slack, Google Sheets, and JIRA. This transparency ensured that everyone understood the reasons behind each decision and knew the latest progress.

Engineering Managers and Staff Engineers led the direction for each service, ensuring that everyone was moving in the same direction. Each major feature had an assigned technical owner and a dedicated Slack channel for discussing implementation details, which facilitated clear and focused communication. This structured approach enabled us to maintain alignment and motivation throughout the project.

Conquering Mount Multi-Tenancy ⛰️

This project was the most complex and large-scale project I have ever participated in. Some of my key takeaways from this “expedition” :

  1. I learned a great deal about technologies I was previously unfamiliar with, including AWS Aurora PostgreSQL, service mesh, progressive delivery, and feature flags. These tools not only helped us complete this project efficiently but also gave us the flexibility to keep improving service stability.
  2. In terms of cross-team communication and collaboration, I gained insights into how leaders manage large-scale project development. I observed how they adjusted directions and decisions based on our progress, helping to clear roadblocks and ensure timely completion.
  3. The project also encouraged me to think outside the box when solving problems and not to limit myself to old methods.
  • For instance, during the final phase, we involved the entire company (over a thousand employees) in internal testing, something I would never have imagined us doing before.
  • Internal testing also put heavy pressure on the member service team: helping testers with test account issues took significant time and manpower while team members were also fixing bugs, which was exhausting and overwhelming. To streamline the process, we used a Slack bot to answer questions and worked with other teams, such as QA and Tech Support, to resolve issues as they came up. This approach enabled us to complete internal testing within the deadline.

I am most proud of leading our member service team to complete the project within the expected timeframe. Given the numerous challenges we faced, including tight deadlines and the introduction of many new tools, the task initially seemed almost impossible and full of uncertainty. However, our team members worked incredibly hard, proactively helping each other and putting in their best efforts to see the project through.

It wasn’t just our team; the entire ShopBack Engineering team came together to accomplish this project. We successfully rolled out the new service collectively, demonstrating outstanding teamwork and dedication. Being a part of this collaborative effort and witnessing our success made me feel very proud to be an Engineer at ShopBack.

More on Engineering Culture at ShopBack

What I enjoy most about working on engineering projects at ShopBack is the collaborative and innovative environment. The team culture encourages open communication and knowledge sharing, allowing us to learn from each other and grow together. Each project presents unique challenges that push us to think creatively and implement cutting-edge solutions.

Additionally, the support from leadership and the clear vision for each project make it a rewarding experience. Knowing that our work directly impacts the company’s success and improves the user experience is incredibly motivating. The opportunity to work with a talented and dedicated team on meaningful projects makes every challenge worthwhile and fosters a sense of pride and accomplishment in our work. #SucceedasOne

❗️ Interested in what else we work on?
Follow us (ShopBack Tech blog | ShopBack LinkedIn page) now to get further insights on what our tech teams do!

❗❗️ Or… interested in us? (definitely, we hope 😂)
Check out here how you might be able to fit into ShopBack — we’re excited to start our journey together!
