How We Designed the Backend Service for the New voilà.id

Apr 26, 2024

Crafting the #1 luxury multibrand store in Indonesia, from concept to code.

Introduction

Founded in 2020, voilà.id curates luxury collections, showcasing bags, shoes, clothing, and accessories from premier brands. Our e-commerce platform initially operated on top of Shopify, integrated with our internal inventory and fulfillment management system, known as the X-Management System (XMS).

After more than two years of operation, we embarked on the journey to rebuild voilà.id from scratch. The decision came from several key factors:

Scalability: While Shopify provided a solid foundation, our growing business demanded a more scalable and customizable solution. Building our own platform would offer greater flexibility to adapt to future growth and changes in consumer demands.
Customization: To deliver a unique and tailored experience to our customers, we required greater control over the platform’s features, user interface, and backend processes. Developing our own solution allowed us to implement customizations specific to voilà.id’s requirements.
Integration: With our in-house X-Management System (XMS) serving as the backbone of our operations, integrating it seamlessly with voilà.id was crucial. Building the platform from the ground up enabled us to optimize integration with XMS, ensuring efficient inventory management, order fulfillment, and data synchronization.
Performance: As an e-commerce platform catering to luxury clients, performance and speed are paramount. By developing voilà.id internally, we could fine-tune performance optimizations, ensuring a seamless and responsive shopping experience for our customers.

After a year-long project, we launched the new voilà.id at the end of February 2024. We consider it a huge success in terms of system stability, performance, and infrastructure costs.

This post offers some insight into how we designed the backend service for the new voilà.id platform.

Design Phase

A. System Architecture

When we started this project, we chose a microservice architecture for our backend service. We adopted GraphQL as our user-facing API, using a federated GraphQL architecture.

This approach involves combining multiple independent GraphQL APIs, referred to as “subgraphs”, where each represents a distinct microservice with its own schema and data sources. These subgraphs are unified through a Router/Gateway, which serves as a central entry point for client requests and handles query decomposition, execution distribution, and result aggregation.

This architecture promotes modularity and scalability, facilitating efficient communication and optimal performance across our microservices, and it will accommodate the growth of the voilà.id system in the future.

The Gateway, also known as the Router, assesses incoming queries against a unified “super” schema comprising all “subgraph” schemas. It identifies the microservice accountable for each field and forwards the relevant subqueries accordingly.

Each microservice independently resolves its designated subqueries and returns the corresponding data. Subsequently, the Gateway consolidates this data and constructs a unified response mirroring the structure of the original query. This consolidated response is then transmitted back to the client.
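The flow described above can be illustrated with a toy sketch. This is not our production Router and not a real GraphQL implementation: the subgraph names, the fields, and the `FIELD_OWNERS` mapping are all hypothetical, and a query is simplified to a flat list of field names. It only shows the three steps: decompose the query by owning subgraph, let each subgraph resolve its part, and merge the results in the shape the client asked for.

```python
from typing import Any, Callable, Dict, List


# Hypothetical subgraphs: each one resolves only the fields it owns.
def product_subgraph(fields: List[str]) -> Dict[str, Any]:
    data = {"name": "Leather Tote", "brand": "Example Brand"}
    return {f: data[f] for f in fields}


def inventory_subgraph(fields: List[str]) -> Dict[str, Any]:
    data = {"stock": 3}
    return {f: data[f] for f in fields}


SUBGRAPHS: Dict[str, Callable[[List[str]], Dict[str, Any]]] = {
    "product": product_subgraph,
    "inventory": inventory_subgraph,
}

# Stand-in for the unified "super" schema: which subgraph owns each field.
FIELD_OWNERS: Dict[str, str] = {
    "name": "product",
    "brand": "product",
    "stock": "inventory",
}


def plan(requested: List[str]) -> Dict[str, List[str]]:
    """Decompose a query into one subquery per owning subgraph."""
    subqueries: Dict[str, List[str]] = {}
    for field in requested:
        owner = FIELD_OWNERS[field]  # KeyError: field is not in the schema
        subqueries.setdefault(owner, []).append(field)
    return subqueries


def execute(requested: List[str]) -> Dict[str, Any]:
    """Forward each subquery, then merge results in the client's field order."""
    partial: Dict[str, Any] = {}
    for owner, fields in plan(requested).items():
        partial.update(SUBGRAPHS[owner](fields))
    return {field: partial[field] for field in requested}
```

For example, `execute(["stock", "name"])` calls the inventory subgraph for `stock` and the product subgraph for `name`, then returns both fields in the order they were requested.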

B. Observability

When we design a microservice architecture, monitoring and observability should be part of the design discussion from day one. As an organization, we should decide which observability tools to use before starting a new project. There are three pillars of observability:

Metrics
Traces
Logs

Metrics can be either system metrics or application metrics. Metrics are used to monitor application or system health. We also use metrics to set up alerts that detect any deterioration of application health before it becomes catastrophic. Traces and logs give more context on “why” the system behaves the way it does, for example when a process is slow or when an incident happens.
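As a rough illustration of a metric-based alert, the sketch below computes a p95 latency with the nearest-rank method and compares it against a threshold. The function names and the threshold values are assumptions made for this example; they are not taken from our actual alerting setup.

```python
from typing import List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, to avoid floating point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def should_alert(latencies_ms: List[float], threshold_ms: float) -> bool:
    """Fire when the p95 latency of the window exceeds the threshold."""
    return percentile(latencies_ms, 95) > threshold_ms
```

With 20 samples, the p95 is the 19th smallest value, so a single slow request does not trigger the alert, while two slow requests out of 20 do.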

We started by weighing the “buy vs build” decision, documenting our findings in a Request For Comments (RFC), conducting Proof of Concept (POC) trials, and having the results reviewed by our engineering team.

After thorough evaluation, we chose to build our monitoring system on self-managed Elastic APM. This decision was grounded in our commitment to optimizing engineering costs effectively.

Leveraging Elastic APM, which is both free and open-source, was a clear win. However, it came with its own set of challenges. We accepted the responsibility of managing the observability infrastructure ourselves, and we custom-coded our instrumentation to fit into our architecture, following the OpenTelemetry standard.

Another crucial aspect is Proactive Alerts. Right from the start, our engineering team needs to be the first to know about any issues, rather than our users or stakeholders.

Being proactive means more than just receiving alerts about incidents; it involves having the necessary information to swiftly identify and pinpoint the problem. We achieve this by making sure our error logging follows a standard format that we defined. We use ElastAlert, an open-source tool that we’ve customized to generate alerts tailored to our specific needs, enabling us to promptly address issues as they arise.
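To make the idea concrete, here is a minimal sketch of a standardized error log and a frequency-style alert check over it. The field names (`service`, `error_code`, `trace_id`, and so on) are hypothetical and are not our actual log format, and the check is plain Python that mimics the idea of "N matching events within a time window"; it does not use ElastAlert’s own API or rule syntax.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import List


def error_log(service: str, error_code: str, message: str,
              trace_id: str, at: datetime) -> str:
    """Serialize an error in one fixed JSON shape so alerts can match on it."""
    record = {
        "timestamp": at.isoformat(),
        "level": "error",
        "service": service,        # which microservice raised the error
        "error_code": error_code,  # stable code that alert rules filter on
        "message": message,
        "trace_id": trace_id,      # links the log line to its trace
    }
    return json.dumps(record)


def frequency_alert(lines: List[str], error_code: str, num_events: int,
                    window: timedelta, now: datetime) -> bool:
    """Fire when at least num_events matching errors fall inside the window."""
    count = 0
    for line in lines:
        record = json.loads(line)
        ts = datetime.fromisoformat(record["timestamp"])
        if record["error_code"] == error_code and now - window <= ts <= now:
            count += 1
    return count >= num_events
```

Because every service emits the same fields, a single rule keyed on `error_code` works across all microservices, and the `trace_id` in the alert gives the on-call engineer a direct pointer to the failing request.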

C. Code Quality

Code quality is the most important goal that we should pursue in our project. If neglected, the effects can be devastating in the long run. Because we would have many microservice codebases, a neglected project becomes hard to maintain, which leads to increasing costs and declining developer motivation.

To overcome this, we carefully prepared several strategies:

Set Up a Standard Code Reference: such as the code repository template we call “skeleton codes” and an internal standard package we call “catalyst-sdk”. This significantly streamlines our development process by applying the same quality standard uniformly across microservices and accelerating feature deployment.
Standard Local Code Tooling: typically a Makefile containing several utilities that speed up the development process, such as automating environment setup so our local environment is ready in seconds when jumping between microservice repositories, running standard code quality checkers locally, generating mocks, running the code repository locally, and so forth.
Write Standard Code Documentation: much of our standard code documentation is written in Confluence. These documents play a key role in assisting developers by providing insight into the standardized code implemented in each microservice repository.
Code Reviews: we made it a policy that everyone in the team takes part in code reviews as part of their KPIs. Code review is a major opportunity to elevate developer skills and spread standard coding practices.
Set Up Code Quality Checkers: such as Pull Request (PR) GitHub Actions that automatically check the build, unit test coverage, static code analysis, and linting whenever a developer submits a PR.

D. Infrastructure

For our microservice architecture, we use Kubernetes clusters, aiming for a setup that supports autonomous and loosely coupled services while ensuring seamless inter-service communication. For critical dependencies like databases and search platforms, we leverage managed service instances from cloud providers. This choice simplifies the complex tasks of ensuring system availability, auto-scaling, and maintaining data consistency.

This infrastructure is designed to:

Orchestrate Deployments Seamlessly, Abstracting Infrastructural Complexities for Developers: we use Helm charts and GitHub Actions workflows to trigger builds and deployments. This simple workflow already covers our requirements and keeps costs down, since we don’t need to invest in any licensed CI/CD tools.
Manage Load and Enable Smooth Scalability of Applications: each microservice uses Horizontal Pod Autoscaling to manage load and keep infrastructure resource usage optimal. This gives us peace of mind under fluctuating user traffic and improves our infrastructure costs.
Provide a Single Dashboard for All Infrastructure Performance Metrics: we use Prometheus to collect various infrastructure metrics from every pod in our system. These metrics cover a wide range of data points such as resource utilization, performance, and health indicators. Additionally, we gather metrics from managed resources provided by our cloud provider. All of these metrics are then aggregated and visualized together in a single Grafana dashboard, offering a centralized and comprehensive view of our system’s performance and health status.
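To give a feel for how Horizontal Pod Autoscaling reacts to load, the sketch below reproduces the scaling rule documented for the Kubernetes HPA: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The function name, the clamping helper, and the numbers in the usage note are illustrative assumptions, not our production configuration.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Approximate the HPA rule:
    desired = ceil(current_replicas * current_metric / target_metric)."""
    ratio = current_metric / target_metric
    # Close enough to the target: skip scaling to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Keep the result inside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For instance, with 4 pods averaging 90% CPU against a 60% target, `desired_replicas(4, 90, 60, 2, 10)` scales out to 6 pods; when traffic drops to 15% average CPU, the same call with `current_metric=15` scales in to the configured minimum of 2.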

Finally!

The new voilà.id launched smoothly in late February 2024, with no significant issues.

The transition of user traffic from Shopify to our new platform was seamless, with zero disruptions to back-office operations. Traffic peaked at around 100 rps with a p95 latency of about 247 ms, without any infrastructure performance glitches.

After one month, we’ve observed that incoming issues are minimal and can be detected early without relying on user or stakeholder reports. With our current architecture, there is plenty of room for future enhancements, and making continuous improvements is a breeze!

About the Writer

Mirza Akbar Mulya S is a Principal Engineer at Catalyst. He has always been excited to orchestrate technology, people, and processes. He loves to explore new technology and run tech-enabled products that scale reliably.

About Company

Catalyst is the holding company of two successful e-commerce businesses: jamtangan.com and voilà.id. Our vision is to be the no. 1 e-commerce company in Indonesia by understanding our customers’ needs and continuously innovating on their behalf to build the best e-commerce website and application in Indonesia. We believe in Catalyst’s greatest asset: our fundamentally strong and loyal team. We work to win.

Interested? Join us here: https://career.ctlyst.id/jobs