Car Sales Prediction — Using Basic Matrix Operations
We have been studying matrix equations like Ax = b since school, but have we ever realized how neatly they express the linear regression problem (y = mx + c)? Today we are going to use matrix operations to find the slope (m) and intercept (c) for predicting a car's resale value.
Before getting into the problem, let us recap Ax = b with an example, as shown below.
Here A is a matrix of size m x n, multiplied by x (n x 1) to obtain the target variable b (m x 1). In other words, b is a linear combination of the columns of A, weighted by the entries of x.
How do we relate this to the linear regression problem?
In our car sales data, we have two columns of interest, Power_perf_factor and Resale_value, and we need to predict the resale value (y) from the input feature, the power factor (x).
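To make the shapes concrete, here is a minimal NumPy sketch of the product Ax = b. The numbers are made up purely for illustration; the point is only that a 3 x 2 matrix times a 2 x 1 vector gives a 3 x 1 result, one entry per row of A.

```python
import numpy as np

# A is m x n (here 3 x 2); the values are arbitrary illustration data.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# x is n x 1 (here 2 x 1).
x = np.array([[1.0],
              [2.0]])

# b = A x is m x 1 (here 3 x 1): each entry is a row of A dotted with x.
b = A @ x

print(A.shape, x.shape, b.shape)
print(b.ravel())  # entries are 5, 11 and 17
```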
resale value = slope * power factor + intercept
To represent this in matrix format, we form a matrix A with the input feature (power factor) as one column and a column of ones for the bias (intercept). The vector b holds our resale values, and x holds the two unknowns, the slope and the intercept.
For a single feature, matrix operations might not seem necessary, but think of a data set with a very large number of rows and many features for predicting the car resale value. Representing the equation (y = mx + c) in matrix format then makes prediction much easier to express and compute.
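The construction just described can be sketched on a tiny example. The values below are invented so that they lie exactly on y = 2x + 1 (they are not taken from Car_sales.csv), and the names `power`, `resale` and `solution` are only illustrative. Because the system has more rows than unknowns, the sketch uses the pseudo-inverse to get the least-squares solution.

```python
import numpy as np

# Made-up values lying exactly on y = 2x + 1 (not real car data).
power = np.array([1.0, 2.0, 3.0, 4.0])
resale = 2.0 * power + 1.0

# Matrix A: first column is the feature, second column is ones for the intercept.
A = np.column_stack((power, np.ones_like(power)))

# Vector b: the target values as a column.
b = resale.reshape(-1, 1)

# Least-squares solution x = pinv(A) @ b; first entry is slope, second is intercept.
solution = np.linalg.pinv(A) @ b
slope, intercept = solution.ravel()

print(A)
print(slope, intercept)  # approximately 2 and 1
```

Since the toy data is noise-free, the recovered slope and intercept match the values used to generate it, up to floating-point error.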
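As a rough sketch of how the same idea extends to several features, the example below builds A from three synthetic feature columns plus a column of ones. Everything here is an assumption for illustration: the data is randomly generated, the weights are invented, and `np.linalg.lstsq` is used in place of the pseudo-inverse (both give the least-squares solution).

```python
import numpy as np

# Synthetic data: 200 rows, 3 features (purely illustrative, not car data).
rng = np.random.default_rng(0)
n_rows, n_features = 200, 3
X = rng.normal(size=(n_rows, n_features))

# Invented "true" parameters used to generate a noise-free target.
true_weights = np.array([1.5, -2.0, 0.5])
true_intercept = 4.0
y = X @ true_weights + true_intercept

# Matrix A: all feature columns followed by a column of ones for the intercept.
A = np.hstack((X, np.ones((n_rows, 1))))

# Solve the least-squares problem A @ coeffs ~= y.
coeffs, residuals, rank, sing_vals = np.linalg.lstsq(A, y, rcond=None)
weights, intercept = coeffs[:-1], coeffs[-1]

print(weights, intercept)
```

The code is the same whether A has one feature column or many, which is the practical benefit of the matrix form.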
Python Implementation
# importing necessary libraries
# pandas --> dataframe operations
# matplotlib --> visualizations
# numpy --> matrix operations
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# reading the data set
data = pd.read_csv('Car_sales.csv')

# storing the input feature in x and the target in y
# note: this assumes both columns have no missing values; drop NaN rows first otherwise
x = np.array(data['Power_perf_factor'].values)
y = np.array(data['Resale_value'].values)

# formatting the input feature as matrix A: one column of x and one column of ones
mat_A = np.transpose(np.vstack((x, np.ones(len(x)))))

# calculating slope and intercept as the least-squares solution pinv(A) @ b
output = np.matmul(np.linalg.pinv(mat_A), np.expand_dims(y, 1))

# calculating ydash from slope and intercept (y = mx + c)
ydash = output[0, :] * x + output[1, :]

# plotting the original y values and the predicted y values
plt.figure(figsize=(20, 5))
plt.plot(y, label='True Label')
plt.plot(ydash, label='Predicted')
# naming the y axis
plt.ylabel('Resale Value')
# to show the legend
plt.legend()
plt.show()
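One way to sanity-check the pseudo-inverse result is to compare it with NumPy's built-in straight-line fit, `np.polyfit` with degree 1. The sketch below uses synthetic noisy data, since the real CSV is not assumed to be available here; the generating slope, intercept and noise level are invented.

```python
import numpy as np

# Synthetic noisy data around an invented line y = 0.3x - 5 (not real car data).
rng = np.random.default_rng(42)
x = rng.uniform(20.0, 200.0, size=50)
y = 0.3 * x - 5.0 + rng.normal(scale=2.0, size=50)

# Matrix approach: feature column plus a column of ones, solved with the pseudo-inverse.
A = np.column_stack((x, np.ones_like(x)))
slope_pinv, intercept_pinv = np.linalg.pinv(A) @ y

# Reference: degree-1 polynomial fit returns [slope, intercept].
slope_fit, intercept_fit = np.polyfit(x, y, deg=1)

print(slope_pinv, intercept_pinv)
print(slope_fit, intercept_fit)
```

Both approaches solve the same least-squares problem, so the two pairs of numbers should agree up to floating-point error.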