Building Flexible Credit Decisioning for an Expanded Credit Box

John Moose
Published in Lendgineering
Jun 12, 2019

At LendKey, we’re building a platform that improves lives with lending made simple. Our innovative cloud technology creates the most transparent online lending platform for consumers shopping for low-cost borrowing options from community banks and credit unions.

Credit decisioning, deciding whether or not to lend by predicting how likely applicants are to repay a loan, is a component of our platform that is critical to our lending partners.

The Problem

Enabling our lending partners to make credit decisions quickly, accurately, and in a way that can be explained is critical to our success. Our legacy credit decisioning system met those criteria, but it had three flaws:

  1. It only supported a single ruleset-style model for determining if a loan application exceeded our lending partners’ limits for credit risk. Every rule was considered in isolation. If a loan application couldn’t meet every single criterion, it was declined.
  2. Its rules for determining credit risk were implemented in code. While it was possible for our lending partners to enable and disable particular rules and customize the threshold for each rule, it was challenging and time-consuming to add new rules.
  3. It was part of a monolithic application. This made it impossible for us to offer it à la carte as well as making it difficult for us to improve and maintain it.

Seeing the Whole Picture

We began this initiative with the simple hypothesis that a more robust credit decisioning engine could give our lending partners a better toolset for approving loans.

We strongly suspected that applicants who would be able to successfully repay loans were being denied due to the simplistic “one strike and you’re out” nature of our platform’s decisioning engine and the underwriting utilized by our lending partners. For example, if our lending partners’ credit risk models required an applicant to have a gross monthly income of at least $5,000 and she had an income of $4,999, she would be declined, regardless of the other characteristics of her application.

A more sophisticated model would consider the applicants holistically. If the applicant’s income was slightly low, but the total balance on her other accounts was small, her FICO score was high, she never had a late payment, and she was requesting a small amount, she might still be eligible for a loan.
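To make the contrast concrete, here is a minimal sketch of the two approaches. The rule thresholds, point weights, and cutoff below are invented for illustration; they are not LendKey's or any lending partner's actual criteria.

```python
# Hypothetical illustration: a "one strike and you're out" ruleset declines
# the $4,999 applicant, while a holistic score can still approve her.

HARD_RULES = {
    "gross_monthly_income": lambda v: v >= 5000,
    "fico_score": lambda v: v >= 660,
}

def ruleset_decision(applicant):
    """Decline if any single rule fails; each rule is considered in isolation."""
    passed = all(check(applicant[name]) for name, check in HARD_RULES.items())
    return "approve" if passed else "decline"

def holistic_decision(applicant, cutoff=600):
    """Award partial credit for each characteristic and compare the total
    to a cutoff. Weights and cutoff here are invented for illustration."""
    score = 0.0
    score += min(applicant["gross_monthly_income"] / 5000, 1.0) * 300
    score += min(applicant["fico_score"] / 850, 1.0) * 400
    return "approve" if score >= cutoff else "decline"

applicant = {"gross_monthly_income": 4999, "fico_score": 780}
print(ruleset_decision(applicant))   # a single missed rule declines her
print(holistic_decision(applicant))  # the holistic view can still approve
```

Under the hard ruleset, the $1 income shortfall is fatal; under the holistic score, her strong FICO more than compensates.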

Careful analysis of historical lending data by our data science team confirmed that credit risk models that considered applicants holistically would allow our lending partners to approve significantly more applications without increasing their risk.

Predictive Models Everywhere

Predictive models are simply algorithms for forecasting an outcome. They are developed using statistical techniques, like regression, to determine the relationship between dependent variables (like risk of loan default) and one or more independent variables (like income and the number of accounts past due). In recent years, machine learning techniques have greatly increased the speed with which new models can be developed.

Predictive models can be implemented in many different ways depending on the application. For example, when making a credit decision, it is imperative that a lender be able to explain exactly why it made that decision. A “scorecard” model will allow our lending partners to consider loan applicants holistically and also to explain their lending decisions. In this type of model, applicants are assigned points for each of the characteristics considered by the model. The closer an applicant comes to the ideal value for that characteristic, the more points he or she is awarded. The overall score is the sum of these partial scores and represents how likely the applicant is to repay the loan. To determine which characteristics were most significant to the credit decision, we simply compare the points the applicant received for each characteristic to the maximum points possible for that characteristic and sort the characteristics based on that difference.

Our new holistic credit risk models are one application for predictive models, but we quickly thought of more. For example, we might:

  • Monitor the health of the loan portfolio we service on an on-going basis;
  • Flag applications that are likely to be fraudulent;
  • Automate time-consuming manual review processes;
  • Score the quality of the data provided by the applicant and third-parties to cut back on the situations when we require applicants to upload documents; and
  • Make more relevant product suggestions.

We knew we had to build a general-purpose tool for evaluating predictive models at scale.

Building on a Strong Foundation

Having software engineers re-implement the predictive models created by our data scientists and lending partners would be time-consuming and costly. Wouldn’t it be better if we could load the models that they developed directly into our lending platform? Enter PMML, Predictive Model Markup Language, a well-established industry standard format for describing predictive models. PMML files can be generated by the tools that data scientists use, like SAS, R, and Python. It is used extensively within analytics-heavy companies like Netflix and Airbnb. JPMML-Evaluator is a popular open source tool for evaluating PMML models.

PMML supports many different types of predictive models, including scorecards, rulesets, regression, clustering, and even neural networks.

PMML and the tools that produce it are perfect for data scientists, but they have a steep learning curve. It would be great to also allow non-technical people to create simple rulesets for credit decisioning. LendKey had already adopted the Camunda business process modelling platform, and it seemed like the perfect fit for this use case because it provides a simple user interface for describing business rules.

We had already identified two different engines for making credit decisions, and we knew there might be more. What if one of our lending partners wanted us to integrate our lending platform with a proprietary decisioning system? What if we wanted to add support for the complicated but powerful successor to PMML, PFA (Portable Format for Analytics)? We definitely didn’t want to maintain support for all these predictive model evaluators in all of the exciting new apps we were planning. We needed a layer of abstraction around predictive models!

Introducing Insights

Rather than updating all of our apps to talk to multiple predictive model engines, we wanted to create a single common “vocabulary” for interacting with predictive models. We designed the Insights service as a well-defined and standards-compliant API for interacting with predictive models. It translates our apps’ requests into specific calls to each of the supported predictive model engines and then translates their responses back into a common format.

When our predictive models are used to do things like make credit decisions, we need to be able to “show our work.” We need to be able to demonstrate to our lending partners that we are correctly enforcing their underwriting criteria. Our data scientists also need to get feedback on our predictive models so that we can improve them over time. For these reasons, the Insights service stores the details of every predictive model transaction, including all of the arguments and results, in an encrypted data store. Our loan origination systems can then associate the unique identifiers for each Insights transaction with the loan event that spawned it and display the transaction’s details to anyone with the appropriate permissions.

We need a general-purpose, data-driven tool to evaluate predictive models, so Insights is, by design, completely ignorant of our business rules and concepts. Insights has no code specific to the concept of a “loan,” an “applicant,” or a “credit score.” It is the responsibility of the applications that use Insights to inquire about which arguments a particular predictive model requires and determine the values for those arguments before asking Insights to evaluate that predictive model.

We get our predictive models from a wide variety of sources, both inside and outside of LendKey. Different predictive models may refer to the same concept, like “primary applicant’s debt-to-income ratio,” using different names. When we set up a new model in Insights, we “map” the arguments it expects and results it generates to fields in Insights’ data dictionary. In this way, consumers of the Insights API are only required to support the fields in the data dictionary, rather than being required to support every field in every predictive model.
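The mapping step might look something like this. The model key and translation helper are hypothetical; the field names mirror the soccer example later in this post, where each model-native name maps to one data-dictionary name.

```python
# Illustrative sketch of field mapping: each model's native field names are
# mapped to shared data-dictionary fields, so API consumers only ever see
# dictionary names regardless of which model they are calling.

FIELD_MAPPINGS = {
    "soccer-model": {
        # model's native name -> data-dictionary name
        "temp_in_degrees_f": "temperature",
        "lightning_in_area": "lightningReported",
        "heavy_rain_in_last_24h": "heavyRainRecently",
    }
}

def to_model_arguments(model_key, dictionary_arguments):
    """Translate data-dictionary argument names into the model's native names."""
    mapping = FIELD_MAPPINGS[model_key]
    reverse = {dict_name: native for native, dict_name in mapping.items()}
    return {reverse[name]: value for name, value in dictionary_arguments.items()}

print(to_model_arguments("soccer-model", {"temperature": 67, "lightningReported": False}))
```

Two models that call the same concept `dti_ratio` and `primary_debt_to_income` would both map to one dictionary field, so consumers write their integration once.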

A Sample Transaction

In the following example, our app would like to use Insights to determine the likelihood that my daughter’s soccer game will be cancelled. Our app will first make a request to the Insights service to find out which arguments our model requires. It will then determine the values for all of these arguments and then make a second request to evaluate the model. Finally, it will take some action based on the results.

[Sequence diagram showing a typical transaction with the Insights API]

We wanted the Insights API to feel instantly familiar to the developers who used it, so we chose to build on industry standards like JSON:API and provide OpenAPI documentation.

First, we ask Insights for the arguments required by our soccer cancellation model, which is identified by a UUID.

GET /insights/v1/predictive_models/00000000-0000-0000-0000-888888888888/fields

{
  "data": [
    {
      "id": "473bb0a3-2e2a-45cb-a453-d9cf97012414",
      "type": "predictiveModelField",
      "attributes": {
        "name": "temp_in_degrees_f",
        "supportedFieldName": "temperature",
        "fieldType": "argument",
        "dataType": "number"
      }
    },
    {
      "id": "15f9eea2-c38d-4dca-9376-857a71c06e3c",
      "type": "predictiveModelField",
      "attributes": {
        "name": "lightning_in_area",
        "supportedFieldName": "lightningReported",
        "fieldType": "argument",
        "dataType": "boolean"
      }
    },
    {
      "id": "dd5fefc9-dcf8-40a5-b78e-dc3891d044c8",
      "type": "predictiveModelField",
      "attributes": {
        "name": "heavy_rain_in_last_24h",
        "supportedFieldName": "heavyRainRecently",
        "fieldType": "argument",
        "dataType": "boolean"
      }
    }
  ]
}

We can see that this model requires three arguments: temperature, lightningReported, and heavyRainRecently. Our app would look up those values and use them to create a request to evaluate the model.

POST /insights/v1/insights

{
  "data": {
    "type": "insight",
    "attributes": {
      "modelId": "00000000-0000-0000-0000-888888888888",
      "arguments": {
        "temperature": 67,
        "lightningReported": false,
        "heavyRainRecently": true
      }
    }
  }
}

We tell Insights the unique identifier of the predictive model we’d like to use and the values for each of the fields required by that model. Our app doesn’t need to know any of the details about which engine will actually be evaluating this model; Insights already knows.

Insights will translate our request into the format required by the appropriate engine and use that engine to evaluate the specified predictive model. The model will make a prediction about whether or not we will be playing soccer today. Perhaps it will be based on a correlation between recent heavy rain and a muddy field.

Insights evaluates the model, securely stores all of the arguments and results, and responds.

{
  "data": {
    "id": "6e4866f9-7e99-40d0-afe3-1e925fe6e083",
    "type": "insight",
    "attributes": {
      "created": "2019-05-14T18:49:05Z",
      "arguments": {
        "temperature": 67,
        "lightningReported": false,
        "heavyRainRecently": true
      },
      "results": {
        "willPlay": "probably not",
        "reason": "muddy field"
      }
    },
    "relationships": {
      "predictiveModel": {
        "data": {
          "id": "00000000-0000-0000-0000-888888888888",
          "type": "predictiveModel"
        }
      }
    }
  }
}

We probably won’t be playing soccer today.
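A client of the Insights API can assemble the two calls shown above in a handful of lines. This sketch builds the request payloads without performing any network I/O; the endpoint paths and JSON:API shapes come from the examples in this post, while the helper function names are our own invention.

```python
# Sketch of a client-side helper that assembles the two Insights API calls:
# first fetch the model's required fields, then submit an evaluation request.

import json

MODEL_ID = "00000000-0000-0000-0000-888888888888"

def fields_request(model_id):
    """Build the request that asks which arguments the model requires."""
    return ("GET", f"/insights/v1/predictive_models/{model_id}/fields")

def evaluation_request(model_id, arguments):
    """Build the JSON:API payload that asks Insights to evaluate the model."""
    body = {
        "data": {
            "type": "insight",
            "attributes": {"modelId": model_id, "arguments": arguments},
        }
    }
    return ("POST", "/insights/v1/insights", json.dumps(body))

method, path, body = evaluation_request(
    MODEL_ID,
    {"temperature": 67, "lightningReported": False, "heavyRainRecently": True},
)
print(method, path)  # prints: POST /insights/v1/insights
```

Because Insights is ignorant of business concepts, the same two-step pattern works identically whether the model predicts soccer cancellations or credit risk.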

The Insights API also provides endpoints for retrieving the details of previous transactions, which is essential for auditing, and managing predictive models, including mapping their fields to fields in Insights’ data dictionary.

What’s Next

LendKey will continue to work with our lending partners to develop more accurate predictive models. The models that our lending partners use for credit decisioning must always be able to “show their work”: explain why they made a particular decision, including the application characteristics that were considered. This disqualifies certain types of machine learning, but ensures that all applicants are treated fairly.

Insights currently offers a RESTful HTTP API. This was simple to implement and perfect for the low-volume nature of credit decisioning. We are now exploring ways to improve its performance. These might include changes to our database encryption library and offering a low latency streaming interface built on top of our inter-application messaging bus, Kafka.

With Insights, we’ve built a strong foundation to leverage predictive models throughout our lending platform. We’re excited about being able to offer loans to people who would have otherwise been declined, and we can’t wait to use this new toolset to streamline the loan application process.
