How we Scaled our Billing Platform

Velu Alagianambi
6 min read · Apr 13, 2018


Billing is one of the key components of any company that charges a fee for rendering services. It deserves at least as much attention as the company’s product, if not more, since it is the backbone of all business back-end systems.

Billing also plays a significant role in earning customers’ trust, which directly affects each customer’s lifetime value to the company. Hence, any change to the billing system needs to be carefully thought out.

Companies in their early stages tend to start with simple processes that may not scale well. As the business grows, they soon hit a point that inevitably requires a major refactoring of the back-end processing. I will walk you through my journey.

Understanding the various components of a Billing System


Data Components

  • Catalog: consists of Products, Services, Plans, Tiers and Pricing Data
  • Accounts Master: holds Account-specific data such as state (Active, Cancelled, Suspended, etc.) and payment details
  • Subscriptions: the Products and Services associated with an Account
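
To make the three data components concrete, here is a minimal sketch of how they might be modelled. Every class and field name below is hypothetical and chosen for illustration; this is not our actual schema, and the single monthly price per plan is a simplifying assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import Dict, List, Optional


class AccountState(Enum):
    ACTIVE = "active"
    CANCELLED = "cancelled"
    SUSPENDED = "suspended"


@dataclass(frozen=True)
class Plan:
    """Catalog entry: a plan for a product or service, with its price."""
    plan_id: str
    product: str
    monthly_price: Decimal  # assumption: one flat monthly price per plan
    currency: str = "USD"


@dataclass
class Account:
    """Accounts Master record: account state plus payment details."""
    account_id: str
    state: AccountState = AccountState.ACTIVE
    payment_token: Optional[str] = None  # hypothetical opaque token, not raw card data


@dataclass
class Subscription:
    """Links an Account to the catalog plans it is subscribed to."""
    account_id: str
    plan_ids: List[str] = field(default_factory=list)
    start_date: date = field(default_factory=date.today)


def monthly_total(sub: Subscription, catalog: Dict[str, Plan]) -> Decimal:
    """Sum the catalog price of every plan on a subscription."""
    return sum((catalog[p].monthly_price for p in sub.plan_ids), Decimal("0"))
```

Keeping prices in the Catalog and only plan references in the Subscription means a price lookup always goes through one place, which is one common way to keep pricing data from being duplicated per account.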

Code Components

  • Billing Engine: charging and Invoice generation job
  • Tax Service: supports quoting and recording taxes
  • Payments: receives Invoice payments
  • Dunning: notifies overdue Accounts and cancels them once the grace period lapses
  • Reporting & Analytics: business metrics and reporting such as MRR, ARPU and GAAP reporting

When a customer creates an account, they go through the following steps:

  1. Select the required Products and Services
  2. Enter the Payment information

Once they complete the process:

  1. A Price Quote, including Tax based on the customer’s location, is shown
  2. A Subscription is created based on the customer’s selection
  3. The Billing Engine runs and charges the customer
  4. The Invoice is then marked as paid, and the customer is sent a payment notification

The Billing Engine, a simple batch job, runs every month to charge the customer accounts that are due for billing. If a charge fails, we mark the invoice as overdue and notify the customer to update their payment information. If the customer still has not paid after a certain grace period, the account is cancelled.
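
The batch job described above can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the 14-day grace period, the retry of overdue invoices on each run, and the `charge`, `notify` and `cancel` callbacks are all hypothetical stand-ins for the Payments, notification and Dunning components.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Callable, Iterable

GRACE_PERIOD = timedelta(days=14)  # assumed value; the real grace period is not stated


class InvoiceStatus(Enum):
    OPEN = "open"
    PAID = "paid"
    OVERDUE = "overdue"
    CANCELLED = "cancelled"


@dataclass
class Invoice:
    account_id: str
    amount_cents: int
    due_date: date
    status: InvoiceStatus = InvoiceStatus.OPEN


def run_billing(
    invoices: Iterable[Invoice],
    today: date,
    charge: Callable[[str, int], bool],  # returns True when the payment succeeds
    notify: Callable[[str, str], None],  # (account_id, message kind)
    cancel: Callable[[str], None],       # cancels the account
) -> None:
    """One pass of the batch job over all invoices."""
    for inv in invoices:
        # Skip invoices that are settled, closed, or not yet due.
        if inv.status in (InvoiceStatus.PAID, InvoiceStatus.CANCELLED):
            continue
        if inv.due_date > today:
            continue
        # Dunning: an invoice still overdue after the grace period cancels the account.
        if inv.status is InvoiceStatus.OVERDUE and today - inv.due_date > GRACE_PERIOD:
            inv.status = InvoiceStatus.CANCELLED
            cancel(inv.account_id)
            continue
        # Otherwise attempt (or retry) the charge.
        if charge(inv.account_id, inv.amount_cents):
            inv.status = InvoiceStatus.PAID
            notify(inv.account_id, "payment_received")
        else:
            inv.status = InvoiceStatus.OVERDUE
            notify(inv.account_id, "update_payment_info")
```

Passing the side effects in as callbacks keeps the loop itself easy to test without a payment gateway.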

How can such a simple system become complex?

“As a system evolves, its complexity increases unless work is done to maintain or reduce it.” (Lehman’s law of increasing complexity, one of the eight Laws of Software Evolution)

Like any typical billing solution, our system did just three tasks:

  1. Generating invoices
  2. Charging and accepting payments
  3. Notifying overdue Accounts and starting dunning

As the business evolved, the system was expected to support the varied taxation rules that came with international expansion, multiple currencies, complex invoices, product bundling, freemium, usage-based billing, one-time charges, stacked discounts, promotions, credits/charge-backs/refunds, and PCI, GDPR and other compliance requirements.

Now we hit a fork in the road: continue evolving the homegrown billing system, or integrate with a subscription billing provider? After a meticulous Gap Analysis based on built-in features, cost and timelines, we concluded that enhancing the homegrown solution was the better fit.

A view of our Current Billing System Architecture & Issues:

Here is how our Billing System looked before the re-architecture:

These were the major issues we had in our system:

  • Tight Coupling between Core Infrastructure and Billing Components: some of the components were fragile, and developers were afraid to make code changes.
  • With the Monthly billing model, there was always a peak load at the start of the month. This resulted in System Scalability issues and a sudden spike in Support Call Volumes: as soon as the billing notification was sent, customers contacted support to make a payment or to clarify charges on the invoice. Billing inquiries accounted for 21% of overall support volume.
  • All the cons of any Monolithic System, mainly reduced agility and efficiency.

How we solved it?

To support the growth of the business, the On-Premise Billing System had to be moved to the Cloud. To add to the mix, the billing model had to change from monthly to anniversary billing, together with a price change, all without causing any downtime for customers or internal business users. This was a pretty daunting goal.

We then identified what needed to be migrated, which was mostly:

  • Account data, including payment information, address etc.
  • Active Subscriptions

The goal of our billing system was to offer better scalability and cut the spikes in workload as much as possible.
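
To illustrate why anniversary billing cuts the workload spike, here is a small sketch comparing the two models. It assumes the monthly model bills everyone on the 1st of the month and that anniversary billing uses the signup day-of-month, clamped to the length of shorter months; both rules and the function names are assumptions for illustration only.

```python
import calendar
from datetime import date


def next_monthly(after: date) -> date:
    """Monthly model (assumed): everyone is billed on the 1st of the next month."""
    if after.month == 12:
        return date(after.year + 1, 1, 1)
    return date(after.year, after.month + 1, 1)


def next_anniversary(signup: date, after: date) -> date:
    """Anniversary model: first date strictly after `after` that falls on the
    signup day-of-month, clamped to the last day of shorter months."""
    year, month = after.year, after.month
    while True:
        last_day = calendar.monthrange(year, month)[1]
        candidate = date(year, month, min(signup.day, last_day))
        if candidate > after:
            return candidate
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Three customers who signed up on different days in January 2018.
signups = [date(2018, 1, 3), date(2018, 1, 15), date(2018, 1, 31)]
today = date(2018, 4, 1)

monthly_dates = {next_monthly(today) for _ in signups}
anniversary_dates = {next_anniversary(s, today) for s in signups}
```

With these inputs every customer lands on the same day under the monthly model, while the anniversary model spreads them over three different days, so both the charge load and the resulting support calls are distributed across the month.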

Risks/Challenges:

  1. Changes to the internal billing architecture had to be transparent to client apps and users, with zero downtime.
  2. Supporting both old and new flows during the transition period.
  3. Keeping Churn to a minimum when migrating customers from Legacy plans to Current plans and when switching billing dates.

Design/Business choices that affected our implementation:

Here are some key business decisions that needed to be made:

  • Monthly billing vs Anniversary Billing
  • Tiered pricing vs Volume pricing
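
The difference between tiered and volume pricing is easiest to see with numbers. In the usual definitions, volume pricing charges every unit at the rate of the tier the total quantity falls into, while tiered pricing charges each unit at the rate of the tier that unit falls into. The sketch below uses a made-up price table; the tier boundaries and prices are hypothetical, not our actual pricing.

```python
from decimal import Decimal

# Hypothetical price table: (upper bound of the tier, unit price).
# An upper bound of None means the tier is open-ended.
TIERS = [(10, Decimal("5.00")), (50, Decimal("4.00")), (None, Decimal("3.00"))]


def volume_price(quantity, tiers=TIERS):
    """Volume pricing: all units are charged at the rate of the tier
    that the total quantity falls into."""
    for upper, unit_price in tiers:
        if upper is None or quantity <= upper:
            return quantity * unit_price
    raise ValueError("price table has no open-ended tier")


def tiered_price(quantity, tiers=TIERS):
    """Tiered pricing: each unit is charged at the rate of its own tier."""
    total = Decimal("0")
    lower = 0  # number of units already priced by earlier tiers
    for upper, unit_price in tiers:
        if quantity <= lower:
            break
        ceiling = quantity if upper is None else min(quantity, upper)
        total += (ceiling - lower) * unit_price
        if upper is None:
            break
        lower = upper
    return total


# 20 units: volume charges all 20 at 4.00; tiered charges 10 at 5.00 plus 10 at 4.00.
volume_total = volume_price(20)
tiered_total = tiered_price(20)
```

For the same 20 units the two models give different totals (80.00 versus 90.00 with this table), which is why the choice had a direct effect on invoices and on how a price change would be perceived by customers.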

Approach

  1. Code Cleanup: eliminate unused code and components, and reduce duplication by moving repeated code into reusable components.
  2. Organizing the monolith into modules that are close to what would end up as individual services. This results in better code sharing and easier maintenance of both the old and new code flows.
  3. Ramping up test coverage, with more integration and performance testing.

This is the representation of the system post refactoring:

Production Deployment


Incremental changes were rolled out in phases to avoid major regression.

Phase 1: New accounts were created in the new billing system only

Phase 2: Accounts were migrated manually during subscription changes, either during Account Growth (the sales team periodically reaches out to customer accounts in good standing to up-sell products) or when customers initiated a plan change on their own.

Phase 3: Automated migration of existing customers
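
The three phases can be sketched as three small rules over a per-account flag that records which billing system owns the account. This is a simplified illustration, not our actual rollout code: the `System` flag, the function names, and the idea of a bounded batch size in Phase 3 are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class System(Enum):
    LEGACY = "legacy"
    NEW = "new"


@dataclass
class Account:
    account_id: str
    system: System


def create_account(account_id: str) -> Account:
    """Phase 1: new accounts are created in the new billing system only."""
    return Account(account_id, System.NEW)


def change_subscription(account: Account, apply_change: Callable[[Account], None]) -> None:
    """Phase 2: a subscription change (up-sell or customer-initiated) is the
    trigger to move a legacy account over before applying the change."""
    if account.system is System.LEGACY:
        account.system = System.NEW
    apply_change(account)


def migrate_batch(accounts: Iterable[Account], batch_size: int) -> List[str]:
    """Phase 3: automated migration of remaining legacy accounts, in bounded
    batches (an assumption) so each run stays small. Returns migrated ids."""
    migrated: List[str] = []
    for account in accounts:
        if len(migrated) >= batch_size:
            break
        if account.system is System.LEGACY:
            account.system = System.NEW
            migrated.append(account.account_id)
    return migrated
```

Because every account carries its own flag, the old and new flows can run side by side during the transition, and each phase only ever moves accounts in one direction.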

Final Note:

Our migration was mostly smooth, although there were a few surprises even after such deliberate planning. These changes improved productivity and engineering efficiency: no more downtime, and no more monkey patching the code to support new features.


On the business side, support call volume dropped by 15%, customer churn fell by 5%, and billing NPS improved by 20 points.

Now that the billing system is re-architected, we are continuously working to improve its efficiency as business needs keep changing. It’s time for you to assess the efficiency of your back-end systems.


Velu Alagianambi

I’m passionate about building high-performing teams and cultures. Engineering Manager @ Atlassian | Mentor @ Plato