Introducing Project Levon

Building Maker’s Economic Risk Engine

Block Analitica

16 min read · May 25, 2023

This article was written by Jan Osolnik, Rodrigo Lugo Frias, and Žiga Nagelj.

Introduction

Since the creation of smart contracts and the rise of Decentralized Finance (DeFi), the interaction between lenders and borrowers has been automated in an open, transparent, and frictionless way. Quantitative risk assessment of the users interacting with these smart contracts is therefore a clear direction to follow to make the ecosystem robust and capital efficient.

One of the biggest challenges facing the industry is the lack of a cohesive risk assessment framework that enables timely measurement and management of risk. This is evident when looking at the many different ways lending protocols liquidate under-collateralized positions. These differences complicate the design of a standardized framework.

However, TradFi actors have faced these challenges before, and we can learn from them. Notably, in traditional lending, underwriters assess prospective borrowers by considering historical information, borrowing capacity, backup capital, and a forward-looking view of the borrowers’ Probability of Default (PD). As a result, Credit Scores were developed to guide the underlying decisions with objective statistical information. Notable examples are the credit scores provided by FICO and Experian. In web3, some protocols have made important contributions to address this issue, such as Spectral and Cred Protocol, among others.

The aim of Project Levon goes beyond assessing the creditworthiness of borrowers. The goal is to build a DeFi Credit Risk Framework for lenders and borrowers by providing an unbiased credit grading methodology, depending solely on public on-chain data. In this way, grades depend on borrowers’ Monitoring Behavior, Transaction History, Collateralization Ratio Protection, Liquidation History, Current Holdings, and Balance Sheet.

DeFi Credit Risk Framework

The assessment of credit risk for Maker borrowers is based on data and advanced analytics that flow from on-chain sources and are processed by the Levon Risk Engine. It is a systematic process of identifying risks and evaluating them quantitatively.

In our previous work, we did a deep dive into our ML model, and we used it to predict the probability of vault liquidation. We presented the model, the methodology, the performance evaluation, and some exploratory data analysis of our observations.

This post extends the initial work into a product called Levon Grade with the aim of expanding it across other DeFi lending protocols. It assesses the risk of default of each wallet address that currently has an active vault open at Maker. We leverage public on-chain data for building our grades, and all data is processed off-chain by our Levon Risk Engine. We explore the improved methodology, present some new findings from our analysis and conclude with potential business use cases and future work.

System diagram of Project Levon applied to the MakerDAO system

Our methodology is explained in detail below, but the diagram above shows the workflow we followed. It starts with a feedback loop that derives insights from the Vault Protection Score analysis and the feature engineering of our previous work, and leads to a Wallet-Level ML Model that determines borrowers’ Probability of Default in a simulated market crash. Finally, we apply common TradFi risk assessment techniques and introduce two concepts that are readily available on-chain, credit mix and externally held capital, to calibrate, rescale, and summarize our work into a unique Credit Grade for Maker borrowers.

Levon Grade in Public Beta

The credit scoring product is available today in a public beta version. You’re invited to interact with it by entering different wallet addresses with an active vault at Maker. We also prepared some example addresses to reduce the friction to insight. In addition to the credit grade, the product returns the explanation behind the grade through a high-level evaluation of the factor contribution. The computed Levon grade ranges from 1 (very high risk) to 10 (very low risk). This is available in the Risk Assessment Report tab. There is also an exploratory data analysis in the Analysis section, which we dive deeper into below.

We’re sharing an example Levon grade computed for a sample wallet, with the factor contribution explained. The wallet is assigned a low risk (credit grade 8), similar to the majority of wallets using Maker vaults. This is consistent with the debt at risk chart, which shows the high collateralization ratio maintained by Maker vault owners.

The Factor Contribution section below shows which factors (signals) have the most impact on the assigned credit grade. We can see that the wallet has a high collateralization buffer (collateralization ratio over liquidation ratio) and also a good credit mix (riskiness of the position’s collateral).

Sample wallet’s explained contribution of different factors (signals)

Methodology

Most credit scoring models are developed using methodologies that use specific predictors to estimate the probability of default. These designs rely on descriptive means for calculating conditional probabilities, leveraging models such as binary logistic regression and classification trees. The accuracy of these models determines the quality of the results.

We approach borrowers’ credit risk based on: Activity, Protection, and Balance Sheet.

Activity refers to the borrowers’ transaction history. It includes Acquired Debt, Credit Mix, Number of Borrows, and Average Length of Credit.
Protection refers to the behavioral patterns that signal whether or not the borrower was actively monitoring the health of their positions. We take into account past events like Liquidation History or Repayment History and the usage of automated liquidation protection services (e.g. DeFi Saver).
Balance Sheet refers to the current amount of assets, liabilities, and equities linked to the borrower. Here we include features like Externally Held Capital, Outstanding Debt, Current Debt to Collateral, and Net Liquid Assets.

It is important to note that within these categories it is easy to draw up a longer and more complete list of individual features that could shape our grading methodology. Also, by analyzing Maker we are focusing on a credit portfolio with an overall good quality of borrowers, as shown by Maker’s debt at risk (as mentioned above).

Feature Engineering

We consider different types of variables when creating the datasets used for our analysis. In general, they can be classified into two groups: Categorical (e.g. Credit Mix) and Numerical (e.g. Collateral to Debt).

Feature engineering and selection pursue two objectives: reducing the number of explanatory variables and simplifying our final model. Data preprocessing and classification therefore play a central role in our approach, with close attention paid to feature engineering, transformation and scaling, and, finally, feature selection. Some of the features that the model leverages include:

active management
collateralization buffer
around-the-clock interactions
repayment activity
tenure (time since wallet’s first vault creation)
externally held capital
credit mix
debt amount
recent activity
historical liquidations

After the data has been preprocessed, we run an analysis to spot possible issues. We follow a systematic approach in which we quantify both the current state of the open vaults managed by a borrower and the borrower’s historical behavior.

Model

Under current market conditions, our model simulates a sudden market crash comparable to the worst historical price drops and estimates the probability of the borrower’s vaults being liquidated. By doing so, we effectively measure the likelihood that the borrower will fail to pay back their debt (Probability of Default) using the probability of liquidation as a proxy.

Using vault liquidation as a proxy allows us to encapsulate the problem as binary classification since there are only two possible outcomes. Thus, through detailed feature engineering and a well-calibrated supervised machine learning model, we optimize for performance on out-of-sample data.
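To make the binary target concrete, here is a minimal sketch of the purely mechanical part of a shock: whether a vault would fall below its liquidation ratio if the collateral price dropped by a given fraction. The `Vault` fields and function names are illustrative assumptions, not the Levon Risk Engine's code, and the sketch ignores what the model is actually meant to capture, namely whether the owner reacts in time.

```python
from dataclasses import dataclass


@dataclass
class Vault:
    collateral_usd: float     # current market value of the collateral
    debt_dai: float           # outstanding DAI debt
    liquidation_ratio: float  # e.g. 1.5 means a 150% minimum collateralization


def is_liquidatable_after_shock(vault: Vault, price_drop: float) -> bool:
    """True if the vault sits below its liquidation ratio after the drop.

    price_drop is a fraction, e.g. 0.4 for a 40% fall in collateral price.
    """
    if vault.debt_dai <= 0:
        return False  # no debt, nothing to liquidate
    shocked_collateral = vault.collateral_usd * (1.0 - price_drop)
    return shocked_collateral / vault.debt_dai < vault.liquidation_ratio


def wallet_label(vaults: list[Vault], price_drop: float) -> int:
    """Binary target (assumed): 1 if any vault of the wallet is liquidatable."""
    return int(any(is_liquidatable_after_shock(v, price_drop) for v in vaults))


vault = Vault(collateral_usd=300.0, debt_dai=100.0, liquidation_ratio=1.5)
print(is_liquidatable_after_shock(vault, 0.4))  # ratio 1.8 after the drop
print(is_liquidatable_after_shock(vault, 0.6))  # ratio 1.2 after the drop
```

A vault at 300% collateralization with a 150% liquidation ratio survives a 40% drop but not a 60% drop if nobody intervenes. A supervised model adds the behavioral features on top of this static check.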

Even though we experimented with a range of models, we decided to use gradient boosting methods since they are widely considered a good option for both performance and scalability. For evaluation, we used a train-test split with k-fold cross-validation. The model’s performance scores are an F1 Score of 0.71 and a ROC AUC Score of 0.97.
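For readers less familiar with the two metrics, the sketch below computes them from scratch on a toy example. F1 depends on the decision threshold used to turn probabilities into labels, while ROC AUC only depends on how well the scores rank positives above negatives, which is why the two can differ noticeably. The toy labels and scores are made up for illustration.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Quadratic in the number of samples, which is
    fine for a sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]                 # 1 = liquidated (toy data)
scores = [0.1, 0.4, 0.35, 0.8]        # predicted probabilities (toy data)
y_pred = [int(s >= 0.4) for s in scores]
print(f1_score(y_true, y_pred), roc_auc(y_true, scores))
```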

Calibration

We extend our base model by scaling our results to overall central tendencies, i.e., we standardize the resulting distribution to obtain precise estimates of the probability of default of borrowers. This is achieved by two means: on the one hand, we scale the PD curve within a range of confidence, and on the other, we use externally held capital and credit mix to broadly assess borrowers’ credit risk.

Rescaling

By simulating a market crash we estimate the likelihood that borrowers’ vaults are liquidated under current market conditions. We then use this distribution as a proxy for their Probability of Default.

In the figure above we show the outcome of rescaling and standardizing our results on a scale from 0 to 100. It can be seen that our initial approach correctly captures the essence of the credit rating system. However, it fails to distribute borrowers into buckets that clearly indicate how risky they are. Thus it is necessary to further calibrate our ML model with other features that help us better assess borrowers’ risk profiles.

Risk fine-tuning

We also leverage TradFi methodology and use two concepts that are readily available on-chain: credit mix and externally held capital which provide additional insight into the likelihood of liquidation. As part of our methodology, they are defined as follows:

A borrower’s address can be associated with different vaults and different underlying assets. Credit mix thus refers to the different types of credit accounts that the borrower has. In principle, what differentiates one particular vault from another, other than the underlying, are their risk parameters. By evaluating the borrower’s credit mix based on a qualitative assessment of Maker Vaults, we created an adjustment to the grade produced by the base model.
On the other hand, externally held capital refers to the amount of available capital outside of Maker that the borrower has. It is possible that a borrower has a considerable amount of capital distributed in a basket of tokens and positions in other protocols, like Aave and Compound. However, the availability of these tokens to repay debt depends on their liquidity in the open market, which is why, to account for the liquidity risk, the nominal amount of this capital is adjusted for slippage.

With these definitions in mind, we build signals for both credit mix and externally held capital and use them to adjust the base credit score model. These can either reward or penalize the final credit grade.
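As an illustration of the slippage adjustment described above, the sketch below applies a per-token haircut to nominal holdings and relates the result to outstanding debt. The token names, slippage figures, and the rule that tokens without a slippage estimate count as zero are all assumptions made for the example, not the calibration used in the product.

```python
def slippage_adjusted_capital(holdings_usd, slippage):
    """Sum of holdings after an expected selling loss per token.

    holdings_usd: token symbol -> nominal USD value
    slippage:     token symbol -> expected fraction lost when selling (0..1)
    Tokens without a slippage estimate are conservatively valued at zero
    (an assumption of this sketch).
    """
    total = 0.0
    for token, value in holdings_usd.items():
        haircut = slippage.get(token, 1.0)
        total += value * (1.0 - haircut)
    return total


def capital_to_debt_ratio(capital_usd, debt_dai):
    """External capital relative to minted debt; infinite if there is no debt."""
    return float("inf") if debt_dai <= 0 else capital_usd / debt_dai


holdings = {"ETH": 1000.0, "XYZ": 500.0, "UNKNOWN": 200.0}  # hypothetical
slippage = {"ETH": 0.01, "XYZ": 0.4}                        # hypothetical
usable = slippage_adjusted_capital(holdings, slippage)
print(usable, capital_to_debt_ratio(usable, 645.0))
```

Nominally this wallet holds 1,700 USD, but only about 1,290 USD counts toward protecting its vaults once illiquid positions are discounted.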

The final results of our methodology are summarized in the chart below, where we show the final Credit Grades of Maker’s users. To enhance interpretability we have decided to scale our scoring from 1 to 10 (1 being the worst possible score and 10 the best).
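The sketch below shows one simple way a probability of default could be turned into a 0 to 100 score, adjusted by signals, and bucketed into a grade from 1 to 10. The linear mapping, the equal-width buckets, and the additive `adjustment` are assumptions chosen for readability. The actual calibration is not specified at this level of detail and is likely more involved.

```python
def pd_to_score(pd: float) -> float:
    """Map a probability of default in [0, 1] to a 0-100 score (higher = safer)."""
    clipped = min(max(pd, 0.0), 1.0)
    return 100.0 * (1.0 - clipped)


def score_to_grade(score: float, adjustment: float = 0.0) -> int:
    """Bucket a 0-100 score into a grade from 1 (worst) to 10 (best).

    adjustment stands in for the combined reward or penalty from signals
    such as credit mix and externally held capital (hypothetical scale).
    """
    adjusted = min(max(score + adjustment, 0.0), 100.0)
    # Ten equal-width buckets: [0, 10) -> 1, ..., [90, 100] -> 10
    return min(int(adjusted // 10) + 1, 10)


base = pd_to_score(0.25)
print(score_to_grade(base))                    # no adjustment
print(score_to_grade(base, adjustment=-20.0))  # penalized by the signals
```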

Credit scores could vary according to the scoring model used and the platform that is being assessed. However, we should note that with our methodology, we created a systematic approach that can be reproduced and transferred to other lending protocols and platforms.

Analysis

We did a deep dive into the model inputs (features), the model output (the target, i.e. the credit grade), a segmentation analysis of wallets based on feature values, and external capital specifically, as a key component in our adjustment of the credit grade.

The External Capital Analysis covers wallets’ capital held outside of Maker, both in the wallet itself and across different protocols. In our models, we calibrated the grades differently based on the location of these assets: held idly in the wallet or “actively” in other protocols.

Addresses with an active Maker vault hold about $600M in their wallets and $700M in external protocols, counting only Ethereum L1. In total, that’s 40% above Maker’s current risky debt (around $1B), ignoring the asset slippage that would be incurred during a potential swap aimed at vault protection. In our methodology, by contrast, we take into account asset amounts after slippage.

Each section provides a unique insight into aspects of the methodology: assessing individual wallet risk, assessing portfolio risk, and sanity-checking the data value chain. The charts are also available in our product.

Model Analysis

In this section, we look at several model features and the insights their visualizations provide.

When the snapshot was taken (May 25th, 2023) there were around 1600 active wallets. Wallets are mostly safe (between credit grades 8 and 10) with some exceptions on the riskier side.

One of the features in the model is the Collateralization Ratio Buffer (collateralization ratio over liquidation ratio). It is based on how far current debt-weighted wallet collateralization across different vaults is from the liquidation ratios of their vault types.

In the chart below we can see that most wallets are well protected with about 70% of wallets being safe against a drop of at least 50% (Collateralization Buffer of 2).
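The link between a buffer of 2 and a 50% drop can be checked with a short calculation. The sketch below assumes the wallet-level buffer is the debt-weighted average of each vault's collateralization ratio divided by its liquidation ratio, which is one reading of the description above. The function names and tuple layout are illustrative.

```python
def collateralization_buffer(vaults):
    """Debt-weighted average of (collateralization ratio / liquidation ratio).

    vaults: list of (collateral_usd, debt_dai, liquidation_ratio) tuples.
    Weighting (c / d) / lr by debt d simplifies to sum(c / lr) / sum(d).
    """
    total_debt = sum(debt for _, debt, _ in vaults)
    if total_debt <= 0:
        raise ValueError("wallet has no outstanding debt")
    return sum(coll / lr for coll, _, lr in vaults) / total_debt


def max_tolerable_drop(buffer: float) -> float:
    """Largest uniform collateral price drop before the buffer falls to 1.

    After a drop x the buffer becomes buffer * (1 - x), so the limit is
    x = 1 - 1 / buffer. Exact for a single vault, approximate for a
    wallet-level average across several vaults.
    """
    if buffer <= 1.0:
        return 0.0
    return 1.0 - 1.0 / buffer


single = [(300.0, 100.0, 1.5)]  # 300% collateralized, 150% liquidation ratio
print(collateralization_buffer(single), max_tolerable_drop(2.0))
```

A vault collateralized at 300% with a 150% liquidation ratio has a buffer of 2 and reaches its liquidation ratio only after a 50% price drop.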

The number of months since the wallet opened its first vault is another model input.

Intuitively, the longer a wallet has been active at Maker, the more experience the vault owner has in managing their vaults, which can positively contribute to its likelihood of protecting itself against liquidation. While active vaults are more likely to be newer, a lot of long-standing wallets remain active.

While it’s clear that there is still a strong inflow of new vaults, there is also a clear pattern of good retention of older wallets (higher tenure).

Tenure of wallets with a currently active vault

As market shocks can happen at any time of day, we created a feature that captures whether a wallet’s vault interactions are spread across different hours of the day. Intuitively, the more spread around the 24 hours the events are, the less likely any of the wallet’s vaults are to experience a liquidation. Around 10% of wallets show a strong pattern of interacting with their vaults throughout the 24h span.
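One simple way to quantify such a pattern is the share of the 24 hours of the day in which a wallet has interacted with its vaults at least once. This is a sketch under that assumption. The feature definition in the model may differ, and `hour_coverage` is a hypothetical name.

```python
from datetime import datetime, timezone


def hour_coverage(timestamps):
    """Fraction of the 24 UTC hours with at least one vault interaction.

    timestamps: iterable of Unix timestamps (seconds) of a wallet's events.
    Returns a value between 0.0 (no events) and 1.0 (every hour covered).
    """
    hours = {
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    }
    return len(hours) / 24


# Four events, three of them in distinct hours (00:00, 01:00, 02:00 UTC)
events = [0, 3600, 7200, 86400]
print(hour_coverage(events))
```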

Recent activity can be informative in how likely a vault owner is to respond to a market shock. About 30% of wallets were active in the last month with 50% of wallets being active in the last 100 days.

Segmentation Analysis

Additionally, with the created feature set we performed a segmentation of wallets using k-means clustering. The k-means algorithm partitions individual instances (wallets) into a pre-defined number of clusters (wallet groups) based on the feature inputs.
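For intuition, here is a minimal pure-Python version of the standard k-means iteration (Lloyd's algorithm), run on two made-up features per wallet. It is only a sketch: the feature choice, the fixed starting centroids, and the absence of feature scaling are simplifications.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on equal-length numeric tuples.

    points:    list of feature tuples, one per wallet
    centroids: initial centroid tuples (their count is the number of clusters)
    Returns (labels, centroids) once assignments stop changing.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical features: (collateralization buffer, hour coverage)
wallets = [(2.0, 0.1), (2.2, 0.0), (1.1, 0.9), (1.0, 1.0)]
labels, centers = kmeans(wallets, centroids=[(2.0, 0.0), (1.0, 1.0)])
print(labels)
```

In practice features on different scales are usually standardized first, since k-means relies on Euclidean distance.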

The output clusters that were returned by the algorithm were mapped to an intuitively understandable set of segments.

These segments are mostly self-explanatory and include Automated, High CR Buffer, Active, Inactive, Early Adopters, Protected, Whales, and Liquidated Wallets. Wallets that didn’t clearly fall into any segment were left unassigned. Wallets can also be assigned to multiple segments, hence the wallet counts across segments add up to more than the number of active wallets.

In the chart below we can see that the higher credit grades tend to have a larger percentage of wallets with a high Collateralization Buffer, tend to be early adopters (high tenure), and are often highly active. Also, they are more likely to be protected by services like DeFi Saver, which we aim to quantify with our methodology.

Wallet count per Segment and Credit Grade

The segment debt per credit grade chart below shows a pattern similar to the wallet counts above, with the added insight that most whales are assigned a relatively high grade and that debt graded 10 is more likely to be covered by automated protection services.

Wallet Debt per Segment and Credit Grade

External Capital Analysis

The asset amount held in the wallets of active vault owners hovered around $1B from February until mid-March 2023. Since then it has dropped to $600M. The drop was mostly caused by removing the exposure of the wallet that holds Polygon’s treasury in MATIC.

As the amount necessary to protect a position is proportional to its DAI exposure, we also look at external capital held relative to debt. Intuitively, the higher the ratio (more external capital relative to debt), the larger the collateral price drop the vault can experience while its owner still has enough capital to protect it against liquidation.

When looking at total external capital (both passive in a wallet and active in other protocols) we see a healthy pattern with 75% of wallets having at least as much external capital as they have minted debt (ratio = 1). 10% of wallets have more than 5x external capital relative to their debt.

Passive and Active External Capital to Debt Ratio — ECDF
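An ECDF like this one can be read as "what share of wallets has a ratio of at most x". The sketch below builds the curve and the complementary share used in the text ("at least as much capital as debt"). The ratios are made-up values for illustration, not the article's data.

```python
def ecdf(values):
    """Empirical CDF as a list of (value, share of observations <= value)."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


def share_at_least(values, threshold):
    """Share of observations greater than or equal to the threshold."""
    return sum(1 for v in values if v >= threshold) / len(values)


# Hypothetical external-capital-to-debt ratios for five wallets
ratios = [0.2, 0.8, 1.5, 3.0, 7.0]
print(ecdf(ratios)[-1])             # last point of the curve
print(share_at_least(ratios, 1.0))  # wallets with ratio >= 1
print(share_at_least(ratios, 5.0))  # wallets with ratio >= 5
```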

As mentioned above, around $700m is held in external protocols (only including Ethereum L1). Instead of TVL, the net asset amount is used to better evaluate capital potentially available for vault protection. When splitting into relevant protocol categories we notice that Morpho-Aave, Compound v3, and Aave v2 dominate the lending space. In staking, there are Lido, Aura Finance, and Convex. In the AMM space, most of the capital is deployed to Uniswap v3 contracts.

Like Maker’s DAI debt exposure, which follows a power-law distribution (a fraction of wallets minting the majority of debt), external capital is highly concentrated.

The largest 5 wallets by external capital hold 50% of the total amount while the top 50 wallets hold about 75% of the total.
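Concentration figures of this kind come from a simple top-n share calculation, sketched below on made-up amounts.

```python
def top_n_share(amounts, n):
    """Share of the total held by the n largest entries."""
    total = sum(amounts)
    if total <= 0:
        return 0.0
    largest = sorted(amounts, reverse=True)[:n]
    return sum(largest) / total


# Hypothetical external capital per wallet (USD millions)
capital = [50, 20, 10, 10, 5, 5]
print(top_n_share(capital, 1), top_n_share(capital, 2))
```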

Future work

The current implementation of the Levon Grade could be leveraged in multiple ways, which are also likely to evolve over time. There are also a lot of potential extensions that could increase its added value and its number of use cases.

Integration

Examples of potential integrations include creating a whitelist of wallet addresses that have historically been successfully stress-tested during major market shocks without a liquidation. These could have access to a specific, riskier vault type.

It could also be used to provide preferential treatment (e.g. lower stability fee, lower liquidation ratio, slower liquidations, etc.) for selected wallet addresses. However, this is currently not possible in the Maker smart contract setup.

The key goal of building this product is to create an automated quantitative risk assessment on a wallet address level with a strong potential of being integrated into different protocols.

Implementation and/or integration depend on each use case, with various (evolving) options for further productionization. ComposeDB, which was launched on Ceramic Mainnet, aims to cover data-intensive use cases, which could also include credit scoring. The product itself could be directly used to issue wallet-specific credit grades as Soulbound tokens or attestations (verifiable credentials), depending on the design decision. There is also a rapidly emerging paradigm of compute-to-data, implemented by organizations such as Protocol Labs (Bacalhau). Finally, Ocean Protocol enables the monetization of ML models in a privacy-preserving way.

Extension

As there is increasing adoption of lending protocols based on cross-collateralization mechanics (Aave, Compound, Spark), the natural extension of the current methodology is to apply it to that protocol design. That also enables a further move towards a DeFi-wide Levon Grade.

For that reason, we are currently working on a more generalized solution. Below we share some charts when applying our methodology to Aave Protocol v2 on Ethereum Mainnet (currently still the largest Aave version by TVL).

After building the feature set on the historical data of Aave liquidation behavior, we can see the feature (signal) importance of a basic feature set. We will continue expanding this to improve the model performance.

The most important signals in the model for predicting whether a wallet is likely to be liquidated during a market shock are its health factor, total borrowed amount, credit mix, and total recent event amount. Intuitively, the health factor before the price drop is the most important signal.

Aave wallets’ feature (signal) importance on the probability of liquidations (a proxy for default)
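For context, Aave's health factor is the value of a wallet's collateral, weighted by each asset's liquidation threshold, divided by its total debt. A position becomes liquidatable when the value falls below 1. The sketch below computes it before and after a uniform price drop. The amounts, thresholds, and the `apply_price_drop` helper are illustrative assumptions.

```python
def health_factor(collaterals, total_debt):
    """Aave-style health factor.

    collaterals: list of (value, liquidation_threshold) per collateral asset,
                 with value and total_debt in the same base currency.
    """
    if total_debt <= 0:
        return float("inf")  # no debt means the position cannot be liquidated
    return sum(value * lt for value, lt in collaterals) / total_debt


def apply_price_drop(collaterals, drop):
    """Hypothetical helper: scale every collateral value by (1 - drop)."""
    return [(value * (1.0 - drop), lt) for value, lt in collaterals]


position = [(1000.0, 0.8), (500.0, 0.5)]  # made-up values and thresholds
debt = 700.0
print(health_factor(position, debt))
print(health_factor(apply_price_drop(position, 0.4), debt))
```

In this toy position a health factor of 1.5 falls below 1 after a 40% drop across all collateral, at which point liquidation becomes possible.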

Given the partial liquidation logic of cross-collateral lending protocols, it’s interesting to see what percentage of collateral is liquidated when liquidations do occur. As with Maker, we are looking at some of the largest price shocks and their impact on the Aave v2 Ethereum Mainnet market.

We do that by comparing wallet collateral state 1 day before the price drop and liquidated collateral within 3 days of the price drop. That gives us the insight that in more than 80% of the cases, less than 40% of the wallet’s collateral is liquidated in that timeframe. This supports these protocols in preventing the accrual of bad debt through position insolvency.

Conclusion

In this post, we introduced Project Levon, which in its current version is applied to the MakerDAO system. We shared it in a public beta version with its underlying methodology, from feature engineering to modeling, calibration, rescaling, and fine-tuning. We expanded on it with an analysis of various components of the methodology to gain insight into the mechanics between inputs and outputs. Finally, we shared some potential use cases for this product.

We will continue expanding this methodology based on the feedback we receive, so we invite everyone interested to share their comments with us.

Future Work

There are multiple avenues to continue expanding on this project, some of these include:

expanding the model to other protocols such as secondary lenders (as mentioned above)
including on-chain data sources other than the Ethereum blockchain
improving the model inputs (feature engineering)
analysis of historical changes in the individual and portfolio risk
improvement of the UI and/or model integration into other dashboards

About Block Analitica

Block Analitica is a dedicated risk management firm providing extensive, data-driven risk assessment, monitoring, and mitigation services tailored to the unique challenges of the decentralized finance (DeFi) ecosystem. Our expertise empowers stakeholders to navigate the DeFi landscape with confidence, ensuring stability, security, and growth for their protocols and businesses.