Report: On Scalability (Part Two)
Measuring scalability success
By: Matthew Sinclair, Partner and Vice President of Engineering at BCGDV
In the second of a two-part series, BCG Digital Ventures Partner and VP of Engineering Matthew Sinclair argues that the key differentiator of the best digital platforms is a next-generation user experience combined with an API-first mentality. He also explores a range of metrics to measure scalability success, spanning commercial, tech and engineering.
In the first article of this two-part series, I explored the three different kinds of scalability and the three domains where a digital business might deploy its scalability efforts: people, platform and client. I concluded that the most significant gains from a scalable client engagement process are found at the boundary of customization and bespoke development.
This is because the vast majority of the functionality of complex systems ends up being table stakes. From the end customer’s perspective, capabilities like onboarding, authentication and authorization, Know Your Customer/Anti-Money Laundering, portfolio and account origination, trading, tax and regulatory reporting, access controls, and advice are necessary for any customer-facing B2B or B2C SaaS. However, what can differentiate a platform is its user experience. For a client to deliver a next-generation user experience, they need to innovate on the front-end in terms of look and feel and product features while quickly and safely deploying those changes to users.
The best way to enable this kind of client-led, user experience development capability is to ensure that the digital platform has a robust and full-featured API surface that programmatically exposes all of the platform’s capabilities. The very best vendors (such as Stripe, Alpaca, or Plaid) index very heavily on developer experience because they know that the best way to enable a differentiating user experience is to provide a great API and then simply let developers build against it.
Apart from unlocking genuinely differentiating user experiences, there is one other place where an API-first digital platform can make a huge difference: competitive evaluations. Clients that implement systems built on top of core-banking-as-a-service or wealth-management-as-a-service platforms invariably want to execute a competitive tender process where they evaluate a long list of vendors that is cut down to a short-list in an RFP-style process. When a vendor can quickly and effortlessly stand up an indicative sandbox that a client can use to run a tech spike, confidence in that vendor’s capabilities increases massively. If, however, a vendor equivocates and makes it challenging to build a spike or prototype against their offering, it becomes much more difficult to build confidence in the platform’s capabilities ahead of the proper implementation.
The principle of an API-led or API-first digital platform that operates headlessly is something that the best vendors know and do well but that weaker vendors tend to ignore. As an increasing number of clients start to go beyond table stakes functionality and look to user experience as a differentiating factor in their offerings, API-first SaaS offerings will become more and more attractive.
How do we measure scalability success?
The best way to gain insights into the scalability of a platform is to measure its performance over time using industry-standard metrics. If we consider performance from the perspective of the business attributes of the platform, then we can use well-known commercial metrics such as:
- Customer acquisition cost (CAC): measures the average cost to gain one additional customer. Minimizing CAC is one of the best ways to increase SaaS profitability, regardless of industry or stage of growth.
- Revenue retention: the revenue generated from the previous month’s (or year’s) customers. Two important revenue-retention calculations that capture churn are net revenue retention (NRR) and gross revenue retention (GRR).
- Customer lifetime value (LTV): represents the total amount of revenue, on average, a company expects to earn per customer over their entire relationship. LTV estimates the full value of an average customer over their lifetime.
- Annual contract value (ACV): represents the average annual contract value of a customer’s subscription. This metric helps measure yearly or multi-year subscription plans.
- Recurring revenue: represents the amount of total recurring, subscription-based revenue that is likely to continue. There are several metrics used to gain insights for the core growth of the business, including monthly recurring revenue (MRR), annual recurring revenue (ARR), committed monthly recurring revenue (CMRR), and average revenue per customer (ARPC).
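The commercial metrics above reduce to simple arithmetic. As a minimal sketch, the following uses standard simplified formulas on hypothetical figures (all numbers are illustrative, not benchmarks):

```python
# Illustrative calculations for the commercial metrics above, using
# standard simplified formulas; all input figures are hypothetical.

def cac(sales_marketing_spend: float, new_customers: int) -> float:
    """Customer acquisition cost: average spend to win one customer."""
    return sales_marketing_spend / new_customers

def nrr(start_mrr: float, expansion: float, churned: float, contraction: float) -> float:
    """Net revenue retention: retained revenue including expansion, as a ratio."""
    return (start_mrr + expansion - churned - contraction) / start_mrr

def grr(start_mrr: float, churned: float, contraction: float) -> float:
    """Gross revenue retention: retained revenue excluding expansion."""
    return (start_mrr - churned - contraction) / start_mrr

def ltv(arpc: float, gross_margin: float, monthly_churn: float) -> float:
    """Customer lifetime value: margin-adjusted revenue over expected lifetime."""
    return arpc * gross_margin / monthly_churn

print(f"CAC: ${cac(50_000, 25):,.0f}")                  # $2,000
print(f"NRR: {nrr(100_000, 8_000, 3_000, 2_000):.0%}")  # 103%
print(f"GRR: {grr(100_000, 3_000, 2_000):.0%}")         # 95%
print(f"LTV: ${ltv(500, 0.8, 0.02):,.0f}")              # $20,000
```

Note how NRR can exceed 100% when expansion revenue outweighs churn and contraction, while GRR is bounded above by 100%.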
Equally, there are well-defined industry standards for measuring the platform’s technology and engineering metrics. We can separate these into measures across the three stacks (collaboration, build, and runtime) as follows:
#1: Collaboration Stack
1. Onboarding
- Time to independent productivity: How long, on average, does it take to onboard a new product manager, engineer, or designer so that they can be independently productive in their team?
- Time to first production commit: How long, on average, does it take for a new engineer to commit code and have it running in production?
2. PEDling (Product/Engineering/Design workflow)
- Enthusiasm: Qualitatively assessed, how enthusiastic is the team about their work and workload?
- Retrospectives: How well is the team doing at conducting and managing retros? Some key metrics to consider: items called out in the retro; items that the team committed to address; items rectified by the end of the sprint
- Communications: Qualitatively assessed, how well is the team communicating both within and beyond its boundaries?
- Learning: How well is the team learning over time? Qualitatively assessed by team members setting learning objectives and then following if they are achieved
- Meetings: How effective are meetings, and how happy are people to attend and contribute? Consider polling meeting attendees for a 1–10 rating (NPS style) and track ratings over time
#2: Build Stack
1. Feature Velocity (or path to production): How long does a feature take from inception (either customer/external or product/internal) to be running in production? This is measured in terms of time in each backlog phase or time from inception to deployment
2. Predictability: How much unplanned work takes up time in the production schedule? This is measured as the ratio of unplanned to planned activity
3. Quality: What percentage of the code base is covered by automated tests?
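The three build-stack measures can each be expressed as a simple ratio or duration. The sketch below computes them from hypothetical sprint data; the field names and figures are illustrative assumptions, not prescriptions:

```python
# A minimal sketch of the three build-stack measures, computed from
# hypothetical sprint data; names and figures are illustrative only.
from datetime import date

# Feature velocity: elapsed days from inception to production deployment.
features = [
    {"name": "export-csv", "inception": date(2023, 1, 2), "deployed": date(2023, 1, 16)},
    {"name": "sso-login",  "inception": date(2023, 1, 5), "deployed": date(2023, 2, 2)},
]
lead_times = [(f["deployed"] - f["inception"]).days for f in features]
print("mean lead time (days):", sum(lead_times) / len(lead_times))  # 21.0

# Predictability: ratio of unplanned to planned work in the sprint.
planned_points, unplanned_points = 40, 10
print("unplanned:planned ratio:", unplanned_points / planned_points)  # 0.25

# Quality: percentage of the code base covered by automated tests.
lines_total, lines_covered = 12_000, 9_600
print(f"test coverage: {lines_covered / lines_total:.0%}")  # 80%
```

In practice these figures would come from the team's issue tracker, sprint board and CI coverage reports rather than hand-entered literals.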
#3: Runtime Stack
1. Availability: What is the uptime of the overall customer-facing system? This is measured as the ratio of actual availability to expected availability, often expressed as percentage uptime “nines” (2 nines: 99%, 3 nines: 99.9%, … 5 nines: 99.999%)
2. Mean time between failures (MTBF): what is the average time a component functions before failing and needing to be repaired?
- Mean time to repair (MTTR): what is the average time required to fix a failed component and return it to production?
- Mean time to failure (MTTF): what is the average time a non-repairable component functions before failing and needs to be replaced?
3. Throughput: How often does the system complete a business-level task or use case? This is measured in business transactions per second
4. Response (also known as latency): How long does it take for a typical request to be serviced end-to-end? This is measured in milliseconds “glass to glass”: the time from when the request leaves the user’s client until the system provides a response
5. Elasticity: How effective is the platform at matching capacity to demand? This is measured as the ratio of demand to capacity
- The platform is over-provisioned when demand is less than capacity
- The platform is under-provisioned when demand is greater than capacity
6. Security: How many security threat incidents have occurred per quarter at each level of severity?
7. Servicing: How many customer service requests does the team receive per week? And, how long does it take for a customer service request to be acknowledged, triaged, scheduled, actioned, and completed?
8. Customer: What do platform operating costs work out to on a per-customer basis?
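Several of the runtime measures above are likewise small formulas: the “nines” translate directly into an annual downtime budget, steady-state availability follows from MTBF and MTTR, and the elasticity check is the demand-to-capacity ratio. A minimal sketch, with hypothetical figures:

```python
# A sketch of how the availability and elasticity measures above are
# computed; all input figures are hypothetical.

MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year for a given number of 'nines' of uptime."""
    availability = 1 - 10 ** -nines  # 2 nines -> 0.99, 3 nines -> 0.999, ...
    return MINUTES_PER_YEAR * (1 - availability)

def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a repairable component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def provisioning_ratio(demand: float, capacity: float) -> float:
    """Elasticity check: <1 means over-provisioned, >1 means under-provisioned."""
    return demand / capacity

print(f"3 nines allows ~{downtime_budget_minutes(3):.0f} min/year of downtime")  # ~526
print(f"availability at MTBF=720h, MTTR=2h: {availability_from_mtbf(720, 2):.4%}")
print(f"demand/capacity: {provisioning_ratio(800, 1000)}")  # 0.8, over-provisioned
```

The jump in stringency between the nines is worth internalizing: three nines permits roughly 8.8 hours of downtime a year, while five nines permits only about five minutes.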
In conclusion, scaling a digital platform involves much more than just scaling its technological components. Scaling the people and the client engagement process is equally important. Good companies know that they need a modern, cloud-native technology stack with a modular architecture designed for continuous deployment and operability. Great companies know how to combine those attributes with an API-first mentality that lets clients build their own differentiating user experiences on top of an entirely headless infrastructure.
However, the very best companies add to their digital platform capabilities a suite of next-generation ways of working that focus on experimentation, failure fitness, learning loops, and the psychological safety of their teams.
When all of these things come together, the digital platform is truly scaling across all possible dimensions.