Credit Risk assessment through Alternate Data and AI

Sawan Kumar
Published in FinTech 2030
20 min read, Jan 11, 2023

Introduction

Almost all contemporary economies are credit based. Credit is central to how complex modern economies work, in which money serves as a medium of exchange, a means of payment, and a store of value. The availability and use of credit supports production, the exchange of goods and services, and the expanding role of the financial services industry. A functioning contemporary economy without credit is difficult to imagine. The defining feature of credit is that one economic participant, whether an individual, a corporate entity, or a governmental institution, forgoes the immediate use of purchasing power already in their possession and temporarily makes it available to another participant who needs it right away.

Credit supports not only our country’s growth but also our personal aspirations. Have you ever considered buying a vehicle or a house, or enrolling in a prestigious university? These things are expensive, and many people cannot pay for them outright from their bank balance. That does not mean they must be beyond our reach. I am a proponent of using credit responsibly to improve one’s life, since I have used it a lot myself. Should everyone, however, be able to obtain credit? If so, how much credit should be available to each person? How should that be decided?

In this article I explain the fundamentals of credit evaluation, the shortcomings of the existing system, and how alternate data and artificial intelligence might help to improve it.

5Cs of Credit

Lenders usually assess the five Cs before granting a loan or credit: Capacity, Capital, Conditions, Character, and Collateral.


Lenders measure each of the five Cs differently: some qualitatively, some quantitatively, and for certain Cs they do not always run calculations. When lending to small and medium-sized businesses, for instance, there is constant back and forth between the bank and the company, somewhat like the bargaining we do with autowalas. In essence, however, these are the five Cs that banks look at before approving your loan.

  1. Capacity: The lender must confirm that the borrower can repay the loan given the requested amount and terms. For a business loan application, the financial institution examines the organisation’s past cash flow statements to assess how much income can be expected from operations. Individual borrowers disclose detailed information about their income and job stability. Capacity is also assessed by comparing the number and size of the borrower’s outstanding debt obligations with their monthly income or projected revenue. The more debt you currently carry, the lower your capacity to repay a new loan.
    Most lenders use standardised formulae to decide whether a borrower’s capacity is acceptable. Mortgage lenders, for instance, use the debt-to-income ratio, which expresses a borrower’s monthly debt obligations as a proportion of their monthly earnings. Lenders see borrowers with high debt-to-income ratios as high-risk, which can result in rejection or in altered repayment terms that increase the total cost of the loan or credit line over its life.
  2. Capital: When evaluating a borrower’s creditworthiness, lenders may consider their capital position. For a business loan application, capital consists of personal investments made into the company, retained earnings, and other assets under the owner’s control. For a personal loan application, capital comprises the balances in savings or investment accounts. Lenders view capital as a backup for meeting the debt obligation if income or revenue is interrupted while the loan is still being repaid.
    Banks favour borrowers with substantial capital because such borrowers have a stake in the outcome. When borrowers contribute their own funds, they feel more invested and are more motivated to repay the loan.
  3. Conditions: Conditions include both the terms of the loan itself and any broader economic circumstances that may affect the borrower. Business lenders consider factors such as the strength or weakness of the wider economy and the purpose of the loan. Business loan applications frequently specify working capital, equipment purchases, or expansion as the purpose. Imagine, for instance, that an auto parts business has had a successful year and wants to grow, so it applies for financing to open a new manufacturing facility. However, the economy is in decline, and car sales are expected to fall in the near future. The business would either be denied the loan or be charged a higher interest rate. In a weak economy, obtaining a loan can be challenging. Although this criterion applies mostly to corporate applicants, the reason for taking on debt is considered for individual borrowers as well. Home improvements, debt consolidation, or financing large purchases are typical reasons given by individual borrowers.
  4. Character: Character refers to a borrower’s track record or reputation in money matters. A common saying is that past conduct is the best indicator of future behaviour, and lenders follow this principle religiously. Each institution has its own method or formula for judging a borrower’s character, taking into account past repayments, honesty, and dependability; the evaluation usually combines qualitative and quantitative techniques.
    The more subjective techniques include examining the borrower’s educational background and career history, contacting personal or professional references, and interviewing the applicant in person. The more objective technique is examining the applicant’s credit history or score, which credit reporting organisations standardise to a common scale.
  5. Collateral: Collateral is the asset that a borrower pledges as security for a loan. Individual borrowers frequently pledge savings, a car, or their house, while business borrowers may pledge inventory or accounts receivable. Because the lender can seize the asset if the borrower stops making loan payments, secured loan applications are treated more favourably than unsecured ones. Banks assess the value of collateral quantitatively and its ease of liquidation subjectively.
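
The debt-to-income ratio mentioned under Capacity can be sketched in a few lines of Python. This is a minimal illustration, not any lender's actual formula: the function name, the applicant figures, and the 40% cut-off are all assumptions chosen for the example.

```python
def debt_to_income(monthly_debt_payments, monthly_income):
    """Return monthly debt obligations as a fraction of monthly income."""
    if monthly_income <= 0:
        raise ValueError("monthly income must be positive")
    return sum(monthly_debt_payments) / monthly_income


# Hypothetical applicant: two existing EMIs against a monthly income of 60,000.
existing_emis = [12_000, 6_000]
dti = debt_to_income(existing_emis, 60_000)
print(f"Debt-to-income ratio: {dti:.0%}")

# Illustrative threshold only; real lenders set their own limits.
ASSUMED_MAX_DTI = 0.40
print("Within assumed limit" if dti <= ASSUMED_MAX_DTI else "Above assumed limit")
```

Adding a third EMI to `existing_emis` raises the ratio, which is the sense in which existing debt reduces capacity for a new loan.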

Although each financial institution has its own process for determining a borrower’s creditworthiness, both personal and commercial credit applications frequently include the five Cs of credit. Candidates who score highly in each area are more likely to be offered larger loans with better conditions for repayment as well as reduced interest rates.

Capacity and Character

Most research has focused on evaluating capacity and character, and that is the direction this article takes. The most significant and popular tool for this is the credit score.

Credit Scores

A credit score is a statistical measurement based on credit history that assesses a consumer’s creditworthiness. Credit scores are used by lenders to determine the likelihood that a borrower will pay back their obligations. Credit scores are determined by the nation’s credit agencies after taking into account a number of variables, including the duration of your credit history, repayment history, and number of credit inquiries, among others.

A better credit score may entitle you to additional benefits when you apply for a credit card or loan from a bank or NBFC, such as a larger credit limit/loan amount, a reduced interest rate, and a longer repayment period.

Different countries use different credit scores. I have highlighted two important ones.

FICO scores are used mainly in the United States, and CIBIL scores mainly in India. Both rely on similar criteria with very similar weightings.

The components of FICO and CIBIL credit scores are:
  1. Payment History: This captures whether you have been meeting your obligations on time, such as paying your EMIs. It is the largest component of both scores. If you miss EMIs, your credit score takes a huge hit. If you want to maintain your credit score, this is where you should focus the most: keeping up with your repayments.
  2. Credit Exposure: This accounts for what percentage of your allotted credit you are using. The rule of thumb is to keep credit utilisation at around 30%. Neither very high nor very low utilisation helps your credit score, and very high utilisation in particular is a red flag.
  3. Length of Credit History: The longer you have handled credit, the stronger the indication that you are experienced and can handle credit well in the future.
  4. Credit Mix: It is desirable to have different types of credit. A mix of instalment credit, secured loans such as car loans or mortgages, and unsecured credit such as credit cards is seen as a favourable trait.
  5. New Credit: Every time you apply for new credit, the institution needs your credit score to decide whether you are a fit applicant, so it sends a hard inquiry to the credit bureau. Each hard inquiry lowers your credit score. So, if you are an undesirable applicant trying to obtain credit from multiple sources, your score drops after every inquiry, which makes you even less desirable. When seeking credit, therefore, do not apply to many lenders at once; go with your best option.

On the other hand, if your credit application is approved, your score drops by a small amount because of the inquiry, but this is more than offset by gains in other factors such as credit mix and, as you make timely payments, payment history.
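
To make the five components concrete, here is a minimal Python sketch. The category weights are the ones FICO publishes for its general-purpose score; everything else is an assumption for illustration, including the function names, the account figures, and the linear blend onto a 300 to 900 scale. Actual bureau formulas are proprietary and are not reproduced here.

```python
# Category weights published by FICO for its general-purpose score.
# All other numbers and formulas in this sketch are illustrative assumptions.
FICO_WEIGHTS = {
    "payment_history": 0.35,
    "credit_exposure": 0.30,
    "length_of_history": 0.15,
    "credit_mix": 0.10,
    "new_credit": 0.10,
}


def credit_utilisation(balances, limits):
    """Total outstanding balance as a fraction of total credit limit."""
    total_limit = sum(limits)
    if total_limit <= 0:
        raise ValueError("total credit limit must be positive")
    return sum(balances) / total_limit


def toy_score(sub_scores, low=300, high=900):
    """Blend per-category sub-scores (each 0.0 to 1.0) into a 300-900 style number.

    The linear blend and the scaling are assumptions, not a bureau formula.
    """
    blended = sum(FICO_WEIGHTS[name] * sub_scores[name] for name in FICO_WEIGHTS)
    return round(low + blended * (high - low))


# Made-up example: two cards with a combined limit of 100,000.
utilisation = credit_utilisation([15_000, 5_000], [50_000, 50_000])
print(f"Utilisation: {utilisation:.0%}")
print("Above the 30% rule of thumb" if utilisation > 0.30 else "Within the 30% rule of thumb")
```

Because payment history carries the largest weight, a poor sub-score there moves `toy_score` more than an equally poor sub-score in any other category.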

Credit Scores around the world

Institutional cultures differ across the world. Let us look at how other countries approach credit scoring.

  1. Japan: Japan has no official credit rating system. If you are moving to Japan, your existing credit rating will not matter unless you bank with a foreign institution that is already connected to a Japanese counterpart. Credit in Japan is typically based on factors like length of employment and salary. Speaking Japanese is also a must; without it, banks will not even consider lending you money.
  2. US, UK & Canada: These countries use a traditional FICO-like credit rating system, with several organisations assigning credit scores on different scales.
    Experian uses a scale of 0–999 points.
    Equifax assigns a score between 0 and 700.
    Callcredit uses a scale from one to five.
    In Canada and India, the credit score is supplied by TransUnion and ranges from 300 to 900.
    There are small differences between these countries. In the UK, for example, registering to vote (or stating your eligibility to do so) can help raise your credit rating.
  3. Spain and Netherlands: In Spain and the Netherlands, maintaining “excellent credit” is mainly about staying off a blacklist of people with negative credit. If you default on a debt, you are placed on the blacklist, which makes it difficult to obtain a loan on good terms. Consumers who default stay on the blacklist for five to six years, or until the debt is paid off.
  4. Germany: Germany has a fairly advanced credit system. Its primary credit agency is SCHUFA, a private business that operates much like credit data brokers in the US and keeps track of open accounts, past-due loans, penalties, and unpaid bills. If you are a new immigrant to Germany, your SCHUFA record begins when you rent your first flat, open a bank account, or pay your first utility bill. Everyone’s score begins at 100 and gradually decreases as they accumulate financial history. A score in the 90s is regarded as good.
  5. Australia and Brazil: Australia and Brazil are in a transition period. In the past five years, their governments have started building credit scoring systems.
  6. China: China is, to put it simply, unusual. The Chinese government is in the midst of enforcing a “social credit” rating system. The social credit system looks at every element of a person’s life, going well beyond what can be considered as credit data. Your credit score will be impacted by not paying a bill as well as other offences like having a bad driving record or smoking in a no-smoking area. A poor score might prevent someone from enrolling in a top university, from purchasing train and aeroplane tickets, and even from matching with certain people on dating websites. They are attempting to create a different kind of civilization there, which is extremely unsettling.

With so many factors affecting how different nations assess your value, it is fair to ask how important credit ratings really are. And should your social behaviour and other non-credit characteristics influence your ability to borrow?

What’s the problem with traditional credit scores?

While credit scores are used by most financial institutions globally to assess a person’s credit worthiness, they have several limitations. I have outlined a few:

  1. No-hits: Because credit scores depend on past data, someone who has never had credit cannot be scored. As a result, billions of people worldwide lack access to fair and reasonable credit. These groups frequently sit at the “bottom of the pyramid” and have the greatest need for credit.
  2. Short-term outlook: Financial firms are aware that bureau scores have a finite shelf life. Scorecards must be updated and re-evaluated over time, or they become unrepresentative. Covid-19 is a prime illustration. Many people who were recently furloughed may still have good bureau scores reflecting their pre-Covid performance, even though they now pose a bigger credit risk than those scores suggest.
  3. Score boosting: Credit bureaus’ attempts to increase transparency by publishing their scoring methodology have raised more questions about credit ratings. Although there are numerous benefits to this, it can also lead to people manipulating their scores in an unrepresentative way to seem to be a lower credit risk, which exposes the lender to new kinds of default risks.
  4. Erroneous data: According to the Brookings Institution, the staggering volume of mistakes in credit reports is the primary issue with credit scores. At least one in every five individuals has a potentially major inaccuracy that distorts their credit report and makes them appear riskier than they are. Such mistakes may result in higher loan costs or loan rejection.
  5. International migration: Equifax, TransUnion, and Experian are the three major global credit bureaus. Since a person’s FICO scores are based on the information in their credit reports, each bureau may give them a different score. Because lenders have different requirements for issuing loans, the credit decision may vary depending on which bureau the lender uses. Credit histories also generally do not carry over from one country to another, so recent immigrants may have to start over and rebuild their credit, which limits their immediate access to loans.

In conclusion, credit scores are not perfect, despite the fact that they do in many ways allow organisations to assess people’s credit worthiness and make lending choices appropriately. Furthermore, we see that individuals who suffer the negative effects of credit scores are frequently those who need credit the most, such as immigrants, young people, and residents of rural regions.

Traditional vs Alternate Data

Until recently, lenders generally based their lending choices on the information included in the core credit files maintained by major national credit agencies. These files contained information on a consumer’s loans’ terms and performance, as well as information from various public sources and credit inquiries. Traditional data also includes other categories of information that customers frequently include on loan applications, like their income, the duration of their employment history, and their employer.

Traditional bureau credit scores take into account the last 5 to 7 years of credit repayment history, although specific models vary. This system favours certain people while excluding many others. With traditional data alone, many lenders cannot evaluate certain individuals, which keeps potentially good borrowers from accessing credit.

Alternative data generally refers to data sources that lie outside that scope. Lenders increasingly add alternative data to assess “thin file” clients, for whom there is not enough traditional data, or to improve the precision of default prediction.

Let’s look at a few sources of alternative data and how useful they are for credit decisions.

  1. Transaction Data: Typically, this information pertains to a consumer’s credit or debit card usage. It may not appear “alternative”, since most lenders already hold this data, often condensed into monthly summaries, but it is rarely mined for its full predictive value. It can be used to generate a wide variety of predictive characteristics, including the ratio of cash to total spending over the previous X weeks, the ratio of spending over the previous X weeks to spending over the previous Y weeks, and characteristics based on the volume, frequency, and value of transactions at various retailer types. AI may also be used to mine this data and surface patterns that are often imperceptible to the human eye.
  2. Telecom / Utility / Rental Data: This is essentially credit history information, but it counts as alternative because it is not often included in credit reports. It can provide insight into a person’s payment behaviour. If we can determine whether someone pays their monthly rent, post-paid mobile bill, or electricity bill on time, we get a good indication of whether they could manage some extra credit.
  3. Social Profile Data: Although it is conceivable to mine Facebook, LinkedIn, Twitter, Instagram, Snapchat, and other social media platforms, few lenders would want to deal with the regulatory challenges of being the first to do so. It might be feasible to extract value from metadata, such as the number and frequency of posts or the size of a person’s social network, rather than from what individuals say on these channels, but this would still probably raise privacy concerns, which is why no one expects it to happen any time soon. Despite what some FinTechs or optimists may claim, this data would also be less valuable than data with a more direct connection to credit, though it could still add a little value.
  4. Clickstream Data: It is possible to derive insights from how a consumer navigates your website, where they click, and how long they stay on a page. This source can produce a great deal of data, but how much of it helps in evaluating credit is another matter, as it is not directly related to credit.
  5. Audio and Text Data: This information can be obtained from credit applications and from recorded customer service or collections calls. It can complement “thin files”.
  6. Social Network Analysis: Social network analysis is a strong option. New technology lets us map a consumer’s network in two significant ways.
    First, this technique can identify all the files and accounts belonging to a single client, even if they carry slightly different names or addresses. I might not have the same ID on Instagram as on my PAN card. Social network analysis could compile all of that data into one file, which would help you better understand the client and their profile.
    Second, we can determine the person’s relationships with others, including family members. When assessing a new applicant with little or no credit history, the credit ratings of the applicant’s network might be a helpful source of information. Let’s say I am about to miss a credit card payment, but my mother is doing well, has a high credit score, and manages her finances well. I may be able to borrow from her and pay off my credit card debt, in contrast to someone whose mother or father is a non-earning relative. By this logic, I could be assigned more credit than that person.
  7. Psychometric or Survey / Questionnaire Data: Psychometrics is a new method of assessing the credit risk of a person who has little or no credit history. There are tests that can gauge a person’s attitude toward making regular payments. This information might supplement other data sources and increase accuracy.
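
The ratio features described under Transaction Data can be sketched as follows. This is a minimal illustration that assumes a hypothetical list of (date, amount, paid-in-cash) records; the function names are invented, and the 4-week and 12-week windows simply stand in for the X and Y weeks in the text.

```python
from datetime import date, timedelta


def spend_in_window(transactions, as_of, weeks, cash_only=False):
    """Sum spending in the `weeks` weeks ending at `as_of` (inclusive)."""
    start = as_of - timedelta(weeks=weeks)
    return sum(
        amount
        for day, amount, is_cash in transactions
        if start < day <= as_of and (is_cash or not cash_only)
    )


def transaction_features(transactions, as_of, x_weeks=4, y_weeks=12):
    """Build two ratio features of the kind described in the text."""
    recent = spend_in_window(transactions, as_of, x_weeks)
    recent_cash = spend_in_window(transactions, as_of, x_weeks, cash_only=True)
    longer = spend_in_window(transactions, as_of, y_weeks)
    return {
        "cash_to_total_spend": recent_cash / recent if recent else 0.0,
        "recent_to_longer_spend": recent / longer if longer else 0.0,
    }


# Made-up records: (transaction date, amount spent, paid in cash?)
sample = [
    (date(2023, 1, 5), 400.0, True),
    (date(2022, 12, 20), 600.0, False),
    (date(2022, 11, 1), 1000.0, False),
]
features = transaction_features(sample, as_of=date(2023, 1, 10))
print(features)
```

Features like these would typically be fed into a model alongside traditional data rather than used on their own.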

FICO carried out a study showing that these data sources do increase accuracy compared with traditional data. If the existing relationship with a customer is strong, adding transaction data increases accuracy by 5–10% over and above traditional data sources. By traditional data sources I do not mean credit scores, but sources such as income, employment, and credit card data, augmented with AI models. FICO’s study covered the six categories other than psychometric data.

Figure: Alternate data sources

Use Case

Let us look at a use case where alternate data can help meet the needs of small vendors who run healthy businesses but have been denied credit for lack of a credit score.

Let us consider Rajesh, a grocery vendor. He sets up shop in the city outside a site where a construction project is under way. Construction workers are his primary customers. They get lunch and pick up groceries for their families every day, but they pay him at the end of the week when they receive their own weekly wages. Rajesh trusts them: he knows where they work, he knows they have jobs, and he knows where to find them. His income is intermittent but dependable.

He buys stock from a variety of wholesalers who accept only cash and do not give stock on credit, so every day he needs enough cash to stock his stall. Occasionally he has a very good day and his stock runs out just a few hours in. Although he is owed a fair amount of money, he does not have it on hand, and he needs to pay the wholesalers to restock and keep the business going. His options are:

  1. Apply for a conventional bank loan, which would take a long time and would likely be rejected for lack of credit history.
  2. Go to loan sharks, who charge close to 20% interest.
  3. Wait for the construction workers to pay him, which could be two days away, and lose all the business in between.

Now, if we look at his CIBIL score, Rajesh has no repayment history because he has never taken a loan; no credit exposure, again because he has never taken a loan; no length of credit history, again because he has never taken a loan; and no credit mix because HE HAS NEVER TAKEN A LOAN. So he won’t qualify for a new loan because he has never taken a loan. You can’t take a loan because you have never had a loan… quite ironic.

This is where alternate data credit models can come to Rajesh’s rescue and support his journey into financial inclusion. Alternate data takes a different route and can break this cycle. Let us see what the possibilities are:

  1. We are not collecting voice or text data, because that might be a privacy concern.
  2. We might have some social profile data, because he might have a Facebook account. We can gather some data from there.
  3. We might not have psychometric data, because we cannot expect Rajesh to sit for a test and answer psychometric questions in his hour of need.
  4. We might have a lot of clickstream data. Because it has no direct credit connection, its utility can be expected to be low.
  5. Utility payment data will also be limited.
  6. Social network data is a good one. Everyone uses phones and everybody has contacts saved. A feature similar to Truecaller could be used to supplement credit assessment. There is a small correlation between the number of contacts on your phone and the credit you are eligible for: the more contacts, the higher the amount you may be eligible for. Further predictions can be made from whose numbers are saved on your phone. If you are a businessman with a lot of numbers on your phone, it suggests that you receive payments from a diverse customer base. So social networks can be a good alternate source for credit scoring even though there is no direct credit connection.
  7. We have Rajesh’s transaction data because he does a lot of transactions on UPI. The primary evidence of his creditworthiness is his cash flow. Every week workers pay their dues and there is a healthy cash flow. He is able to stay net positive every week.
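
The cash-flow signal in the last item can be illustrated with a short Python sketch. It assumes a hypothetical list of signed UPI amounts tagged with a week number (positive for money received, negative for money paid out) and checks whether every week nets out positive, which is the pattern described for Rajesh. The data and function names are made up for the example.

```python
from collections import defaultdict


def weekly_net_cash_flow(transactions):
    """Aggregate signed amounts into a {week: net cash flow} mapping."""
    totals = defaultdict(float)
    for week, amount in transactions:
        totals[week] += amount
    return dict(sorted(totals.items()))


def is_net_positive_every_week(transactions):
    """True if inflows exceed outflows in every week that has activity."""
    return all(net > 0 for net in weekly_net_cash_flow(transactions).values())


# Made-up data: (week number, signed amount in rupees).
upi_history = [
    (1, -9_000), (1, 11_500),   # money paid out, then workers settle their dues
    (2, -10_000), (2, 12_000),
    (3, -8_500), (3, 10_200),
]
print(weekly_net_cash_flow(upi_history))
print(is_net_positive_every_week(upi_history))
```

A real model would look at far more than this single flag, but it shows how a lender could read repayment capacity from transactions alone, without any credit history.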

Isn’t this the ideal customer who should qualify for a loan and whom banks should be lending to? With the customer’s explicit permission, a bank or other institution can pull such information from the phone, which can let the vendor obtain working capital loans.

So, where we had no credit profile, we managed to use alternate data to make low-risk loans to people who need them and, more importantly, who can pay them back. It is a win-win for Rajesh and for the bank. Once Rajesh pays off this loan, he can continue his credit journey through conventional channels.

Moreover, this AI process is nearly instant. Once the system has your data, it can assess it within minutes to determine your credit limit. Rajesh does not have to wait for approvals and red tape to get a loan.

Case Study

A company in the Philippines, Pera247, is doing exactly this. To give loans to those for whom little or no financial information is available, Pera247 came up with a novel technique for building a scoring system from a smartphone, which most people own nowadays. Pera247 assigns a score to each accessible data point, including applications installed and used, contacts, SMS, and other things that might be indicative of a person’s creditworthiness.

Certain phone data points can forecast an applicant’s chances of repaying a loan on time in the absence of a trustworthy credit bureau history. For example:

  1. Applications: data on the categories of your apps, including in-app purchases
  2. Contacts: your network, based on the information in your phonebook
  3. Media: statistics on the quantity and type of files you save or produce on your device
  4. Calendar: your calendar’s events
  5. SMS: your incoming and outgoing SMS messages and notifications

Pera247’s technology may access the above-mentioned information on your mobile device (your digital footprint, SMS messages, contacts, calendars, application list, and storage) to create a score that can be used to award you credit.
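
The idea of giving each phone data point a score can be sketched as a simple points-based scorecard. To be clear, this is not Pera247's method, which is not described in detail in the source: every feature name, band, and point value below is invented purely to illustrate the mechanism.

```python
# All feature names, bands and points below are invented for illustration.
# Each band is (minimum value, points awarded once that minimum is reached).
SCORECARD = {
    "contact_count": [(0, 0), (50, 10), (200, 20)],
    "finance_apps_installed": [(0, 0), (1, 10), (3, 15)],
    "months_of_sms_history": [(0, 0), (6, 10), (24, 25)],
}


def points_for(feature, value):
    """Return the points of the highest band whose minimum the value reaches."""
    earned = 0
    for minimum, points in SCORECARD[feature]:
        if value >= minimum:
            earned = points
    return earned


def phone_score(profile):
    """Sum points across features; features missing from the profile count as zero."""
    return sum(points_for(name, profile.get(name, 0)) for name in SCORECARD)


# Hypothetical applicant profile derived from phone data.
applicant = {
    "contact_count": 120,
    "finance_apps_installed": 2,
    "months_of_sms_history": 30,
}
print(phone_score(applicant))
```

In practice the bands and points would be learned from repayment outcomes rather than set by hand.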

Hurdles with AI credit assessment

Many AI models are black boxes. They take in hundreds or thousands of data points and derive thousands or even millions of linear or nonlinear rules to classify a new data point into categories or score it on a scale. When you pass in a new data point, the model runs the rules and outputs its prediction. With so many rules involved, it becomes practically impossible to connect the input data to the predicted value.

Now imagine that you are a bank or an NBFC and your credit assessment is done by an AI model. The model gives you a go-ahead and you extend credit, but the person or entity ends up defaulting. You would find it hard to justify the assessment because you have no way of connecting the inputs (which could number in the hundreds) to the false positive result. The input values, the rules in the model, and the processing logic fall beyond human comprehension. You won’t have a way of finding out what went wrong or how to correct it.

Although AI may assist model creators in lowering model risk and increasing the overall predictive ability of models, a sizable portion of the financial sector is still wary of the explainability hurdle that machine learning techniques must overcome. Indeed, improvements in model accuracy frequently come at the expense of their capacity to be explained. Additionally, this lack of explanation presents credit professionals with a moral and practical dilemma.
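
To see what explainability means in practice, it helps to contrast a black box with a simple linear scorecard, where each input's contribution to the decision can be read off directly. The sketch below is an illustration under stated assumptions: the feature names and coefficients are invented, and it is not a method proposed in the sources cited here.

```python
import math

# Invented coefficients for a toy logistic model of default probability.
COEFFICIENTS = {
    "debt_to_income": 3.0,
    "missed_payments": 0.8,
    "years_of_history": -0.2,
}
INTERCEPT = -2.0


def explain(applicant):
    """Return default probability and each feature's contribution to the log-odds."""
    contributions = {
        name: COEFFICIENTS[name] * applicant[name] for name in COEFFICIENTS
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


probability, contributions = explain(
    {"debt_to_income": 0.5, "missed_payments": 2, "years_of_history": 3}
)
print(f"Estimated default probability: {probability:.2f}")
# List features from most to least influential on this decision.
for name, value in sorted(contributions.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {value:+.2f}")
```

With thousands of interacting nonlinear rules, no such per-feature breakdown falls out of the model directly, which is the trade-off between accuracy and explainability described above.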

One way to address this problem would be to standardise AI use, by which I mean standardising the input data or the models used. More work is needed to arrive at an acceptable AI credit assessment model that can be used widely in the industry.

References

  1. https://www.investopedia.com/ask/answers/040115/what-most-important-c-five-cs-credit.asp
  2. https://www.businessinsider.in/slideshows/miscellaneous/many-countries-dont-use-credit-scores-like-the-us-heres-how-they-determine-your-worth/slidelist/65479903.cms#slideid=65479913
  3. https://www.fico.com/blogs/using-alternative-data-credit-risk-modelling
  4. https://www.pera247.ph/faq/mobile-data
