Can You Solve This Data Science Problem?

Machine learning model for forecasting loan status

Benjamin Obi Tayo Ph.D.
The Startup


Photo by Karla Hernandez on Unsplash

Two years ago, I had a screening interview with a financial company that uses data science and analytics to predict the creditworthiness of its customers and to determine how likely they are to repay a loan in full. As part of the interview process, I was assigned a take-home challenge problem. Please see below for the project description and instructions.

The dataset for this problem can be downloaded from this GitHub repository.

The dataset is complex (50,000 rows and 2 columns, with lots of missing values), and the problem is not straightforward. You have to examine the dataset critically and then decide what model to use. The problem was to be solved in a week. The instructions also specify that a formal project report and an R script or Jupyter notebook file be submitted.
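One possible first step when examining the dataset is to count how many values are missing in each column. The sketch below is only an illustration using Python's standard library: the function name `missing_value_report`, the column names `feature_a` and `feature_b`, the toy rows, and the list of tokens treated as missing are all assumptions made for the example, not details taken from the real file.

```python
import csv
import io

# Tokens treated as "missing" are an assumption; adjust them to the real file.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none"}


def missing_value_report(csv_file):
    """Return (row_count, {column: missing_count}) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in reader.fieldnames or []}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in counts:
            value = row.get(name)
            # A short row yields None for the absent field; count it as missing.
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[name] += 1
    return row_count, counts


# Toy stand-in for the real file; column names and values are made up.
sample = io.StringIO(
    "feature_a,feature_b\n"
    "12,\n"
    "30,7\n"
    "NA,\n"
    "45,\n"
)

rows, missing = missing_value_report(sample)
for column, count in missing.items():
    print(f"{column}: {count}/{rows} missing ({count / rows:.0%})")
```

To run it on the downloaded data, you would pass an open file handle instead of the toy `sample`. If you work in pandas, `df.isna().sum()` gives a similar per-column count.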

At the time of writing, I don’t know the solution to this problem or what type of model would be suitable for tackling it.

I would like to challenge you to try to solve this problem yourself and let me know what your solution is.

Model for forecasting loan status
