Exploring Debt With Data Analysis
This article will mix two of my favorite things, finance and data analytics.
Background
Let’s imagine a scenario in which 5 years ago we saved $20,000 to buy a car in full. In this scenario there is no taxes, fess, or any other extra cost associated with purchasing a car. We went to the dealership, and we bought that nice $20,000 car. It’s 2020 now, and we wonder if that was really the best way to go about it. I mean, we have no monthly payments, so thats good right? On the other hand, we could have done monthly payments on that car and kept the $20,000 a little bit longer. But, there is interest associated with that, so we sure have done the right decision. If we kept the $20,000 a little bit longer, then maybe we could have used that money for something else. Perhaps we could have bought stocks. In order to investigate what we could have done differently, we will use python to research if investing our money in the stock market would have been a better choice than to use it to pay the car in full.
Let’s Get Started
First, we will be importing all the libraries that we’ll need:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
We’ll use pandas to work with data frames, Matplotlib to visualize data, and seaborn to visualize data. Now, we need to start by being able to calculate our monthly payments. We’ll code the following formula to python.

def monthlyPayment(loanAmount,annualIterest,numYears):
n = numYears*12
r = (annualIterest/100)/12
top = r*loanAmount
bottom = 1-((1+r)**-n)
payment = round(top/bottom,2)
return payment
In order to better visualize what we’re paying and when, we will create a function that generates an amortization schedule. An amortization schedule shows us how much interest and how much principal we are paying. The principal is basically the loan amount. We can generate that using the following function
def ammortizationSchedule(loanAmount,annualInterest,numYears):
payment = monthlyPayment(loanAmount,annualInterest,numYears)
interestAmount = []
principalBalance = []
prior_interest = 0
n = 12*numYears
r = (annualInterest/100)/12
for period in range(n):
interest = loanAmount * r
principalPayment = payment - interest
loanAmount -= principalPayment
prior_interest += interest
principalBalance.append(round(loanAmount,2))
interestAmount.append(round(prior_interest,2))
return pd.DataFrame({'Period':range(1,n+1),'Interest_Paid':interestAmount,'Principal_Balance':principalBalance})
The previous function will generate an amortization schedule, and it will return a data frame to better view it. Now, we need to research what the average car loan rate was back when we bought our car. According to The Fed, the average rate was 4.05% for a 5 year loan. Now, let’s plug in some numbers in our functions.
amSchedule = ammortizationSchedule(20000,4.05,5)
amSchedule.head()
We will get the following output.

Cool, so we see how payments look like for the first 5 periods. We can see the whole amortization schedule by removing the .head(). Now, let’s get monthly data on stock market returns. For this example, we will look at SPY returns. SPY is an ETF that tracks the S&P 500, so it is well diversified. It should expose us to a comfortable amount of risk and reward. We will be getting the data from Yahoo Finance as a CSV file, and we will be using pandas to manipulate it.
We will get this following output.

This helps us get familiar with how the data frame looks like. Now, we will get data from the past 5 years. For some reason, the data includes two data points for October of 2020, so we will get rid of one.
past5years = SPY.loc[(SPY['Date'] >= '2015-10-01') & (SPY['Date'] != '2020-10-02')]
Now, we want to find out what the change of the stock was every period for the past 5 years, and we want to sum the total of our returns.
def growthPerPeriod(moneyAmount,pricePerPeriod):
stockAmount = moneyAmount/list(pricePerPeriod)[0]
PL = []
difference = pricePerPeriod.diff()[1:]
total = 0
for i in difference:
value = stockAmount * i
total += value
PL.append(round(total,2))
return PL
The previous function will return a list of the sum-to-date of our returns. We are assuming that we are not selling any stock from our portfolio, and we are making payments from our regular income. Now, let’s call our function to calculate our returns.
totalReturns = growthPerPeriod(20000,past5years['Close'])
We have everything we need, so let’s visualize our results with the following code.
sns.lineplot(amSchedule['Period'],amSchedule['Interest_Paid'],label='Interest Paid')
sns.lineplot(amSchedule['Period'],amSchedule['Principal_Balance'],label='Principal Balance')
sns.lineplot(amSchedule['Period'],totalReturns,label='Stock Market Gain')
plt.legend()
plt.ylabel('Amount in $')
plt.xlabel('Payment Periods')
plt.show()
print('Total Returns from the Stock Market {}'.format(totalReturns[-1]))
print('Total Interest Paid {}'.format(list(amSchedule.Interest_Paid)[-1]))
print('Total Amount of Loan {}'.format(list(amSchedule.Principal_Balance)[0]+list(amSchedule.Interest_Paid)[-1]))
It will output the following.

Interpret The Graph
The orange line represents the balance of the loan, the blue line represents the sum of the interest paid, and the green line represents our stock market gains and lost. Both orange and blue line are expected. It is a linear relationship to the number of periods. As we pay our monthly payments, our total interest payment will go up and our balance on the loan will go down.
The green line is not as easy to understand, but we can tell that it generally goes up. One of the risks in investing in the stock market is that the future is uncertain, and we do not have any guarantees that we will make our money grow. For example, we can see that around the 50th period the stock we invested in took a sharp dive, but the stock gained it all back. Around that period is when coronavirus started. There was no way any one could have predicted that a pandemic was on the horizon, so the market was not prepared.
Overall, we would have gained $11,963.23 and paid $2,126.93 in interest. We can say that we would have turned a $9,836.3 profit.
The Wrap up
A $9,836.30 profit over 5 years is perhaps not a whole lot of money, but it is more money than we began with. However, when the sum of money increases, our profit will increase as well. Unfortunately, we made some assumptions that would not hold in real life. The concept that debt is bad is not always true. Debt is a financial tool to accomplish goals. The lower the cost of debt, the more of an attractive tool it is.
Besides the quick lesson in finance, this is a simple example of the power of data analytics. In the near future, data analytics will not be a competitive advantage, it will be a necessity to be in the business enviroment.
This article has limitations, and I am not a financial advisor. Do not take this article as financial advise.