Building a Credit Card Recommender with Scikit-Learn and Deploying It on the Web and as a Chatbot

Tahsin Mayeesha
Learning Machine Learning
6 min read · Dec 28, 2018
Number of credit/debit cards for unique banks in my dataset

On a great foggy day, after several rounds of brainstorming, I came up with the bright idea of building a credit card recommendation system for my CSE327 project. CSE327 is the software engineering course at my university and has a project requirement. When I suggested it to our instructor, Dr. Nabeel Mohammed, he agreed that it was a pretty interesting idea and that, with some constraints, it could be a candidate project for the course. Thus, my exploration into the credit and debit cards of Bangladesh began.

This blog post is going to focus more on the web development and model deployment parts instead of the machine learning related issues since we’re still doing active research on the project.

Aside from finishing the course, I had another motive. I wanted to learn how to deploy machine learning models on the web and other platforms. I also wanted to learn general software engineering and web development skills. For another project with sensor data I chose to learn Android app development, but that’s a separate blog post.

Context

Bangladesh is a developing country whose economy is growing fast. The GDP per capita of Bangladesh grew to 1516 USD in 2017, compared to 88 USD in 1960. Individuals have more disposable income now, and the infrastructure of the country has also improved a lot. Most banks have ATM booths all over the country.

Bangladesh most definitely does not have a cashless economy, but credit card usage is growing. Grocery super shops, mobile and electronics companies, hotels, resorts and restaurants all offer different sorts of discounts based on the card used for payment. Given that, it’s a prime time for building a credit card recommendation system.

The model

My initial idea was to recommend cards based only on card features and to use a similarity measure to recommend similar cards according to the user’s preferences. Our teacher suggested that, apart from the card features, we should also consider the location of the user. Even if the card preferences match, the user may not have any nearby offices of the bank, especially in the remote areas of an agricultural country like Bangladesh.

We collected data on 130 cards from 15+ unique banks, including features like card type, interest rate, credit limit and the rewards associated with the card. Since the banks change their website designs frequently and the dataset is small, I chose not to scrape the data. All information that was collected is public.

Due to a lot of missing values, I had to discard a few features I initially collected and also drop some cards where the information density was extremely low. I expected bank websites to have complete information about the cards, but that expectation was knocked out once data collection began. The majority of the cards do not even provide updated, correct information about general properties like interest rate.
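The cleaning step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the card records, the feature names and both 50% thresholds are assumptions made up for the example.

```python
# Hypothetical card records; None marks a value the bank's site did not publish.
cards = [
    {"name": "Card A", "interest_rate": 20.0, "credit_limit": 100000, "lounge_access": None},
    {"name": "Card B", "interest_rate": None, "credit_limit": 50000, "lounge_access": None},
    {"name": "Card C", "interest_rate": None, "credit_limit": None, "lounge_access": None},
    {"name": "Card D", "interest_rate": 27.0, "credit_limit": 200000, "lounge_access": True},
]

# Assumed thresholds: a feature or card is dropped when MORE than half is missing.
MAX_FEATURE_MISSING = 0.5
MAX_CARD_MISSING = 0.5


def missing_fraction(values):
    """Fraction of entries in `values` that are None."""
    values = list(values)
    return sum(v is None for v in values) / len(values)


def clean(records):
    """Drop sparse features first, then cards that are still mostly empty."""
    features = [key for key in records[0] if key != "name"]
    kept_features = [
        f for f in features
        if missing_fraction(r[f] for r in records) <= MAX_FEATURE_MISSING
    ]
    kept_cards = [
        r for r in records
        if missing_fraction(r[f] for f in kept_features) <= MAX_CARD_MISSING
    ]
    return kept_features, kept_cards


kept_features, kept_cards = clean(cards)
kept_names = [card["name"] for card in kept_cards]
```

Dropping features before cards means a card is judged only on the features that survive, so a card is not penalized for a column nobody filled in.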

The dataset

The model is a basic nearest neighbor model trained with scikit-learn, which I pickled for deployment. Since this was a software engineering course, not a machine learning one, it wouldn’t have made sense to focus on improving the model instead of finishing the project requirements.

As input, the model takes different preferences from the user. The outputs are the recommendation scores and indices from the unsupervised nearest neighbor model.
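To make the input and output concrete, here is a minimal sketch of a nearest neighbor lookup. It is a plain-Python stand-in rather than scikit-learn itself, so that it runs without dependencies; scikit-learn's NearestNeighbors exposes a `kneighbors` method that likewise returns distances and indices. The card names, the feature values and the choice of Euclidean distance are assumptions for illustration only.

```python
import math
import pickle

# Hypothetical, already-scaled card features: [interest rate, credit limit, rewards]
card_names = ["Card A", "Card B", "Card C", "Card D"]
card_features = [
    [0.2, 0.9, 1.0],
    [0.8, 0.1, 0.0],
    [0.3, 0.8, 1.0],
    [0.9, 0.2, 0.0],
]


def kneighbors(features, query, k):
    """Return (distances, indices) of the k rows closest to `query` (Euclidean)."""
    scored = sorted((math.dist(row, query), i) for i, row in enumerate(features))
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# "Fitting" a brute-force nearest neighbor model only stores the data,
# so persisting it for deployment amounts to pickling that data.
blob = pickle.dumps({"names": card_names, "features": card_features})

# In the web app or chatbot: load once, then query per request.
model = pickle.loads(blob)
user_preferences = [0.2, 0.85, 1.0]
distances, indices = kneighbors(model["features"], user_preferences, k=2)
recommended = [model["names"][i] for i in indices]
```

The smaller the distance, the closer the card is to the stated preferences, so the distances can serve directly as recommendation scores.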

Model working in a prototype

Web Development with Django

My background is in machine learning. I don’t have much experience with web development, but I’ve worked a little bit with Flask. I didn’t expect Django to be the sort of beast it actually is. For a beginner it’s quite hard to get Django’s URL configurations working. Django as a framework comes with many features like user verification, authentication and sign-up out of the box, but it expects developers to work in the ‘Django’ way.

A Django project is basically one project consisting of multiple ‘apps’. The apps can be pulled out of one project and into another as needed. Many other Python packages are built on top of Django, so adding things like social authentication is pretty straightforward. The URL configurations have to be defined at the project level first and then delegated to the app level as required. Django follows the classic MVC pattern under the name model-template-view: the model is the model, templates play the role of the traditional MVC views, and the controller mechanism is handled by Django’s views.

A request sent from the client is first matched against the URL configuration and sent to the appropriate view function or class. The view handles the request and returns the appropriate response.

In this case, we expected that the user would submit a form containing their preferences, and that the recommender would handle the request in the backend and generate the recommendations from the model.
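The request flow described above can be sketched without any framework. This is not Django's API; it is a small plain-Python imitation of the idea (match the path, call the view, return a response), and the paths, view names and request dictionary shape are all hypothetical.

```python
import re


def home_view(request):
    """Hypothetical landing page view."""
    return {"status": 200, "body": "Home"}


def recommend_view(request):
    """Hypothetical view: read the submitted preferences and build a response."""
    if request["method"] != "POST":
        return {"status": 405, "body": "Submit the preference form with POST."}
    card_type = request["form"].get("card_type", "credit")
    # A real view would call the pickled recommender here.
    return {"status": 200, "body": f"Recommendations for {card_type} cards"}


# Ordered (pattern, view) pairs, playing the role of a URL configuration.
URL_PATTERNS = [
    (re.compile(r"^/$"), home_view),
    (re.compile(r"^/recommend/$"), recommend_view),
]


def handle(request):
    """Match the request path against the patterns and call the first matching view."""
    for pattern, view in URL_PATTERNS:
        if pattern.match(request["path"]):
            return view(request)
    return {"status": 404, "body": "Not found"}


response = handle({"method": "POST", "path": "/recommend/", "form": {"card_type": "debit"}})
missing = handle({"method": "GET", "path": "/missing/", "form": {}})
```

In Django the pattern list lives in the URL configuration and the views receive a request object instead of a dictionary, but the order of events is the same.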

User Input Form Demo

The database used was SQLite, since it was a demo project and we had only 130 cards in the card dataset. We added user authentication by setting up the correct Django URLs, social authentication with Gmail, and general search functionality to make the project nicer. To be really honest, front-end work does not interest me much; it was Bootstrap copy-paste all along.

For location-related features, I didn’t have enough time in the project to integrate location scores with the card preference scores based on card similarity, so I decided to flag the nearby banks by searching within a 2000 meter radius of the user’s location.

The location API used was HERE Technologies, which is an excellent, intuitive API for location handling. Google Maps required me to pay with a credit card, and I didn’t want to start paying for a course project. To show nearby banks, the user enters the area and the city. The text is then geocoded using the HERE geocoding API, and the extracted latitude and longitude are used to search for nearby banks within a given radius. Unfortunately, the HERE API is not that precise for countries like Bangladesh.
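The post does not show how the radius search itself was done, and it may well have been a single HERE API call. As an illustration of the underlying idea only, the sketch below filters a list of bank locations by great-circle (haversine) distance from an already-geocoded user position. The coordinates, bank names and function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def banks_within(user_lat, user_lon, banks, radius_m=2000):
    """Return (name, distance in meters) for banks inside the radius, nearest first."""
    hits = []
    for name, lat, lon in banks:
        distance = haversine_m(user_lat, user_lon, lat, lon)
        if distance <= radius_m:
            hits.append((name, round(distance)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical geocoding result for the user and made-up branch coordinates.
user_lat, user_lon = 23.8103, 90.4125
banks = [
    ("Bank A branch", 23.8150, 90.4150),
    ("Bank B branch", 23.8300, 90.4300),
    ("Bank C branch", 23.8000, 90.4000),
]

nearby = banks_within(user_lat, user_lon, banks)
nearby_names = [name for name, _ in nearby]
```

Cards from the banks in `nearby_names` could then be flagged in the recommendation list, which matches the flagging approach described above without merging location into the similarity score.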

Google Dialogflow Integration

Since the project had extra marks for deploying to a mobile platform, we decided to use Google Dialogflow to make a chatbot that recommends credit cards. The Dialogflow code uses the same pickled model, but instead of building a REST API around the model and sending requests from both the website and the chatbot, I ended up keeping the projects separate.

The chatbot connects to my machine via ngrok, which securely exposes the localhost server to the public. The Flask app with the bot code runs on localhost. Dialogflow’s vocabulary has many constructs like intents and entities; here I’ll focus on giving the basics in a few lines.

When a user talks to the chatbot, Dialogflow assumes the user has some intent. In this case we assume the intent is to find a credit or debit card recommendation. The intent can collect data from users with entities. We have multiple entities in the backend of Dialogflow. For example, an entity card-type may have multiple values like credit, debit or prepaid.

The collected data is sent to the server via a webhook as a fulfillment request. The fulfillment request is a JSON request that goes to the webhook URL. The request reaches my local machine through ngrok; my machine runs the code, finds the recommendations and sends the fulfillment response back to the chatbot. That response is ultimately shown to the user. This is what a typical fulfillment request looks like.
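A trimmed, illustrative request and a matching handler might look like the sketch below. It assumes the Dialogflow ES (v2) webhook format, where the collected entity values arrive under `queryResult.parameters` and the reply text goes back as `fulfillmentText`. The IDs, the intent name, the card names and the `recommend_cards` helper are hypothetical; only the card-type entity comes from the description above.

```python
import json

# Trimmed, illustrative webhook request; real requests carry more fields.
sample_request = json.loads("""
{
  "responseId": "example-response-id",
  "session": "projects/example-project/agent/sessions/example-session",
  "queryResult": {
    "queryText": "I want a credit card",
    "parameters": {"card-type": "credit"},
    "intent": {"displayName": "recommend-card"}
  }
}
""")


def recommend_cards(card_type):
    """Hypothetical stand-in for a call into the pickled recommender."""
    catalog = {"credit": ["Card A", "Card C"], "debit": ["Card B"]}
    return catalog.get(card_type, [])


def handle_fulfillment(request_json):
    """Build a fulfillment response from the parameters Dialogflow collected."""
    params = request_json.get("queryResult", {}).get("parameters", {})
    card_type = params.get("card-type", "")
    cards = recommend_cards(card_type)
    if cards:
        text = f"Recommended {card_type} cards: " + ", ".join(cards)
    else:
        text = "Sorry, I could not find a matching card."
    return {"fulfillmentText": text}


response = handle_fulfillment(sample_request)
```

In a Flask app, a function like this would sit behind a POST route that parses the request body as JSON and returns the resulting dictionary as JSON.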

Chatbot Demo

This is a demo conversation.
