How we built a probability of default model without default labels

Bryant Chen
Published in Brex Tech Blog · Jun 16, 2022

Introduction

Brex launched the first corporate card for startups to address a woefully underserved market. Despite raising millions of dollars in cash, startups were unable to obtain corporate cards without personal guarantees. The result was personal liability for corporate debt, limits meant for personal use that were too low for a business, or even outright denial if founders had insufficient US credit history.

Recognizing that these startups clearly had the ability to repay large limits based on their cash position, Brex developed an underwriting policy that was both simple and innovative. Customers could connect their bank account(s) to Brex and receive a credit limit between 5 and 20% of their cash balance, depending on risk factors.

Since most startups eventually run out of cash, we recomputed this limit daily to ensure our customers could continue to afford the limit provided. This dynamic underwriting allowed Brex to maintain extremely low losses on a portfolio that most creditors would consider very risky.

While limits were large enough that daily fluctuations were not an issue for most of our customers, for others not knowing what their credit limit would be tomorrow was problematic. Additionally, while cash is a good proxy for risk in venture-backed startups, it can also be a noisy one, leading to unnecessary limit fluctuations.

An obvious data science solution for more precise risk assessment is to develop a classification model to predict the probability that a customer will repay or default. However, training such a model requires defaults, something which we fortunately had very few of. In this post, we describe how we got around this issue by building a “structural PD model” that forecasts future customer cash balance. This model became the centerpiece for a new credit policy that had all the benefits of dynamic underwriting while significantly stabilizing limits.

Stabilizing Limits using Balance Forecasting

The original credit policy, based on current cash balance, provided limits Brex was confident the customer could repay today. However, there was no guarantee that the customer could repay that amount when it was due at statement close. Thus, we crudely adjusted the credit limit every day to ensure that the customer could continue to repay.

Instead, if we could perfectly predict a customer’s balance at statement close, we could set the limit based on that amount, ensuring that the customer had adequate cash to repay the amount owed. Furthermore, we could do this at statement open and remove all fluctuations during the statement period.

While we did not have default data, we were able to access two years of transaction history when a customer connected their bank account. This meant we could construct thousands of daily balance time series, one for each customer, giving us a sizable amount of data with which to train forecasting models.
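For readers curious what that preprocessing might look like, here is a minimal sketch of turning raw bank readings into one daily balance series per customer. The data and column names (customer_id, date, balance) are purely illustrative, not Brex's actual schema.

```python
import pandas as pd

# Hypothetical raw bank readings: one row per customer per day with activity,
# carrying the end-of-day balance reported by the bank connection.
transactions = pd.DataFrame({
    "customer_id": ["acme", "acme", "acme", "globex", "globex"],
    "date": pd.to_datetime(
        ["2022-01-03", "2022-01-05", "2022-01-10", "2022-01-04", "2022-01-09"]
    ),
    "balance": [120_000, 118_500, 131_000, 54_000, 49_500],
})

# One daily balance time series per customer: resample each customer's
# readings onto a daily grid and forward-fill days with no bank activity.
daily_balances = (
    transactions.set_index("date")
    .groupby("customer_id")["balance"]
    .resample("D")
    .last()
    .groupby(level="customer_id")
    .ffill()
)
print(daily_balances)
```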

Of course, even with infinite amounts of data, it is not possible to perfectly predict future balances. Therefore, our new policy relied not only on estimated future balances, but also on prediction intervals. The prediction intervals allowed us to provide limits that we were statistically confident customers would have in cash at statement close. We called this amount the “confidence-adjusted” balance forecast. If the 97.5% confidence-adjusted balance forecast was $100,000, then the model predicts a 97.5% probability that the future balance will be at least $100,000.
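The post does not prescribe a particular forecasting model, but one simple way to picture the confidence-adjusted forecast is as a lower quantile of the predictive distribution. A minimal sketch, assuming (for illustration only) normally distributed forecast errors:

```python
from scipy.stats import norm

def confidence_adjusted_forecast(point_forecast: float,
                                 forecast_std: float,
                                 confidence: float = 0.975) -> float:
    """Balance level the customer exceeds at statement close with the given
    probability, assuming (for illustration only) normal forecast errors."""
    return point_forecast + norm.ppf(1.0 - confidence) * forecast_std

# Example: a $125,000 point forecast whose 95% interval is +/- $25,000
# implies a forecast std of ~$12,755 and a 97.5% confidence-adjusted
# forecast of ~$100,000.
std = 25_000 / norm.ppf(0.975)
print(confidence_adjusted_forecast(125_000, std))
```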

On the first day of the statement, the credit policy assigned the limit as a percentage of the confidence-adjusted forecast. Then, on each subsequent day of the statement, the forecast was recomputed. As long as the expected cash balance at statement end stayed within the bounds of the initial prediction interval, the limit would not change. If the forecast moved outside the initial prediction interval, the limit was updated.

For example, suppose that it is the first day of a new statement for Acme Corp. The end-of-statement forecast is $125,000, and the 95% prediction interval is $100,000 to $150,000 (i.e., the model estimates with 95% probability that the customer’s end-of-statement balance will be between $100,000 and $150,000). Therefore, the 97.5% confidence-adjusted end-of-statement forecast is $100,000 (i.e., the model estimates with 97.5% probability that the end-of-statement balance will be at least $100,000), and we provide a $50,000 credit limit: 50% of the confidence-adjusted forecast, with the remaining 50% serving as a buffer.

Limit is set to 50% of $100,000, the confidence-adjusted forecast

The next day, we pull fresh data and the forecast changes slightly to $115,000. However, since the forecast is still within the initial 95% prediction interval of $100,000 to $150,000, the limit remains at $50,000.

Forecast has dropped from $125,000 to $115,000, but remains within the initial prediction interval, so the limit doesn’t change

Now suppose on the 16th day something unexpected happens and the forecast drops to $80,000. Then we would drop the limit to 50% of the new confidence-adjusted forecast. Similarly, if the forecast increased to $175,000, the limit would also increase.

Forecast has dropped to $80,000, which is outside the initial prediction interval. The limit is dropped to 50% of the new confidence-adjusted forecast.

To summarize, we set the limit at a level that we are statistically confident that the customer can repay when they need to repay. Moving forward, we only change the limit if something truly unexpected happens. In this way, we continue to react to new information in real time and reap the benefits of dynamic underwriting, while providing significantly more stable limits.
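Condensed into code, the day-to-day logic looks roughly like the sketch below. The 50% buffer and 95% interval mirror the example above; resetting the reference interval whenever the limit is updated is our own simplifying assumption for this sketch, not a detail taken from the production policy.

```python
from dataclasses import dataclass

BUFFER = 0.50          # limit as a fraction of the confidence-adjusted forecast

@dataclass
class StatementState:
    lower: float       # reference 95% prediction interval for the statement
    upper: float
    limit: float

def open_statement(forecast: float, half_width: float) -> StatementState:
    """At statement open, set the limit from the confidence-adjusted forecast
    (the lower bound of the 95% prediction interval)."""
    lower, upper = forecast - half_width, forecast + half_width
    return StatementState(lower, upper, BUFFER * lower)

def daily_update(state: StatementState, forecast: float,
                 half_width: float) -> StatementState:
    """Re-forecast each day; only move the limit if the new point forecast
    falls outside the reference interval. Resetting the reference interval
    when that happens is a simplifying assumption made for this sketch."""
    if state.lower <= forecast <= state.upper:
        return state                       # inside the interval: limit unchanged
    lower, upper = forecast - half_width, forecast + half_width
    return StatementState(lower, upper, BUFFER * lower)

state = open_statement(125_000, 25_000)        # limit = $50,000
state = daily_update(state, 115_000, 24_000)   # still inside: limit stays $50,000
state = daily_update(state, 80_000, 20_000)    # outside: limit drops to $30,000
print(state.limit)
```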

Results

The new policy reduced limit drops by ~80%. Moreover, this was accomplished without increasing risk or reducing customer spend, as verified through A/B testing. Additionally, limit increases outnumbered limit decreases.

This was not an unexpected outcome: all else being equal, limits will tend to increase over the course of the statement. As the statement period progresses, the forecast horizon shrinks. A shorter forecast horizon means greater certainty in the outcome and a narrower prediction interval, which in turn implies a higher confidence-adjusted forecast and limit.
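To make this concrete, consider a toy random-walk model of the balance (an illustrative assumption, not the model described above): the forecast standard deviation grows roughly with the square root of the horizon, so the interval narrows and the confidence-adjusted forecast rises as statement close approaches.

```python
from scipy.stats import norm

daily_sigma = 5_000     # illustrative daily balance volatility
forecast = 125_000      # point forecast of the end-of-statement balance

for days_to_close in (30, 15, 5, 1):
    std = daily_sigma * days_to_close ** 0.5      # random-walk scaling
    lower_975 = forecast + norm.ppf(0.025) * std  # confidence-adjusted forecast
    print(f"{days_to_close:2d} days out: 97.5% lower bound ~ ${lower_975:,.0f}")
```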

Additionally, since the used balance increases monotonically through the statement, this also means that, all else being equal, customers get a higher limit precisely as they need it more: the limit rises both as we become more certain that they will have the necessary funds at repayment time (i.e., as they become less risky) and as their used balance grows. In contrast, the original policy tended to decrease limits over the statement period as customers burned cash, setting them up for disappointment when they could not actually spend the amount they thought they could at the start of the statement.

Ultimately, we rolled out balance forecasting to ~65% of customers. The model’s forecasting accuracy was poor for the remaining ~35%. We likely could have increased coverage by continuing to invest in the model, but we eventually had sufficient delinquency labels to replace it with a more traditional classification-based approach to risk assessment.

Balance Forecasting as a Formal Risk Model

Since inception, the central tenet of the startup underwriting policy has been that cash balance is a strong proxy for default risk among startup customers. This assumption is reasonable since cash is likely the primary liquid asset for startups, and Brex automatically draws from the customer’s bank account at statement close. Thus, if the cash balance at statement close is greater than the used balance, we receive the funds that we are owed. Conversely, if the cash balance is not greater than the used balance, the customer will almost certainly become delinquent.

In addition to stabilizing limits, the balance forecasting policy established a quantitative, model-driven framework around this central tenet. By making the “structural assumption” that default occurs if and only if a customer does not have the cash to repay at statement close, we can use the distribution of future balance to get the probability of default. The probability of default becomes the probability that the future balance will be lower than the amount owed.
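Under that structural assumption, the probability of default falls straight out of the predictive distribution of the balance. A minimal sketch, again assuming a normal forecast error purely for illustration:

```python
from scipy.stats import norm

def probability_of_default(balance_forecast: float,
                           forecast_std: float,
                           amount_owed: float) -> float:
    """P(default) = P(end-of-statement balance < amount owed), under the
    structural assumption and an illustrative normal forecast error."""
    return norm.cdf(amount_owed, loc=balance_forecast, scale=forecast_std)

# Example: a $125,000 forecast with a ~$12,755 std and a $50,000 used balance
# implies a vanishingly small probability of default.
print(probability_of_default(125_000, 12_755, 50_000))
```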

Thus, we can quantitatively measure the impact of various risk factors and automatically adjust the limit to maintain a target risk level across our portfolio. In contrast, the original credit policy had to specify individual rules that adjusted the limit for each risk factor without a quantitative estimate of the risk implications.

For example, if the connection to the bank account breaks, the original policy specified thresholds at which to decrease the limit as the bank data became increasingly “stale”. In contrast, the balance forecast policy precisely measured the decrease in accuracy of the end-of-statement global balance forecast due to the longer forecast horizon required by a stale bank account reading. It then automatically adjusted the limit downward to account for this increased risk.
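Using the same toy random-walk scaling as the earlier sketch, a stale reading simply lengthens the effective forecast horizon, which mechanically lowers the confidence-adjusted forecast and hence the limit (illustrative numbers only):

```python
from scipy.stats import norm

daily_sigma, forecast, days_to_close = 5_000, 125_000, 10

for days_stale in (0, 5, 15):
    horizon = days_to_close + days_stale          # stale data lengthens the horizon
    std = daily_sigma * horizon ** 0.5
    lower_975 = forecast + norm.ppf(0.025) * std  # basis for the limit
    print(f"reading {days_stale:2d} days old: confidence-adjusted forecast ~ ${lower_975:,.0f}")
```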

Of course, this same logic can be applied to any risk factor. Prior to balance forecasting, adjustments were made on a case-by-case basis as new risk factors were discovered. After balance forecasting, new risk factors could simply be added as inputs to the forecasting model.

Validity of the Structural Assumption

The implication that our balance forecasting model can be used as a probability of default model rests on the structural assumption that customers default if and only if they do not have sufficient cash to cover the amount owed at the time of repayment. Once we had sufficient delinquency data, we tested this assumption.

Specifically, we compared delinquent customers’ cash balances at the time of the failed payment with those of non-delinquent customers at a sampled snapshot of due dates. To normalize across companies with different repayment obligations at Brex, we divided the cash balance by the Brex payment amount. The box plot below shows that the cash-to-amount-owed ratio does not perfectly predict delinquency, but it is a very strong predictor. In particular, the cash-to-amount-owed ratio at repayment time is significantly lower and less dispersed for delinquent customers than for non-delinquent customers.
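For the curious, the check itself is straightforward to reproduce. A sketch with hypothetical data and column names:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical snapshot of customers at a payment due date.
snapshot = pd.DataFrame({
    "cash_balance":  [30_000, 250_000, 12_000, 400_000, 8_000, 90_000],
    "amount_owed":   [40_000, 50_000, 15_000, 60_000, 20_000, 30_000],
    "is_delinquent": [True, False, True, False, True, False],
})
snapshot["cash_to_owed"] = snapshot["cash_balance"] / snapshot["amount_owed"]

# Compare the ratio's distribution for delinquent vs. non-delinquent customers.
snapshot.boxplot(column="cash_to_owed", by="is_delinquent")
plt.ylabel("cash balance / amount owed at due date")
plt.show()
```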

Therefore, while the structural assumption isn’t perfect, it was a reasonable assumption that enabled us to build a “PD model” without default labels.

Conclusion

Eventually, we did gather enough data to train a more traditional classification-based PD model, which is in use today. Like our balance forecasting model, this PD model both determines the initial limit and uses prediction intervals to decide if and when to change it. Indeed, this dynamic underwriting framework was the subject of a recently submitted patent application. While structural modeling is no longer needed for our corporate card for startups, the ability to build risk models without default labels has continued to prove invaluable as Brex rapidly ships new products and expands into new industries.

This article was co-authored by:

Bryant Chen is a staff data scientist at Brex. Prior to Brex, he was a member of the research staff at IBM. He holds a Ph.D. in computer science from UCLA and a B.A. in math and economics from the University of Chicago.

Lillian Xu is a senior data scientist at Brex. Prior to Brex, she was a data scientist at Moody’s Analytics. She holds Masters degrees in both Financial Engineering and Operations Research from U.C. Berkeley and a B.S. in math from the National University of Singapore.

Special thanks to Luis Crouch, data science manager at Brex, for careful editing and great suggestions on this article, and to Tony Ren, engineering manager at Brex, for all the late nights we spent together working to ship this thing.
