Building a Linear Model from scratch with NumPy

Pradnya Saval
5 min read · May 6, 2020


Let us build a simple linear model, which is the basis for creating more complicated models.

Step 1: Import python libraries:

# Import the relevant libraries for the problem. NumPy is a must for this model.
import numpy as np

# matplotlib and mpl_toolkits are not mandatory; they are just used for visualizing the results.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

Step 2: Create random input to train the model

# Declare a variable containing the size of the training set we want to generate.
observations = 1000

# Two input variables are used, namely xs and zs.
# Generate them randomly, drawing from a uniform distribution.
# 3 arguments of this method are (low, high, size).
# The size of xs and zs is observations by 1. In this case: 1000 x 1.
xs=np.random.uniform(low=-10,high=10,size=(observations,1))
zs=np.random.uniform(-10,10,(observations,1))

# Combine the two dimensions of the input into one input matrix.
# This is the X matrix from the linear model y = x*w + b.
# column_stack is a NumPy method which combines two vectors into a matrix. Alternatives are stack, dstack, hstack, etc.
inputs = np.column_stack((xs,zs))

# The dimensions of the inputs should be n x k, where n is the number of observations, and k is the number of variables, so 1000 x 2.
print (inputs.shape)

Output: (1000,2)
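
As a side note (my own addition, not part of the original steps): if you want every run to draw the same random numbers, you can seed NumPy's generator at the top of the script, before the uniform draws. The seed value 42 is an arbitrary choice.

# Optional: seed the random generator so the results are reproducible.
np.random.seed(42)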

Step 3: Create targets:

# Make up a target function, so we can check whether the algorithm learns it.
# We add a small random noise to the function, i.e. f(x,z) = 2x - 3z + 5 + <small noise>
noise = np.random.uniform(-1,1,(observations,1))

# Produce the targets according to the f(x,z) = 2x - 3z + 5 + noise definition.
# In this way, we are basically saying: the weights should be 2 and -3, while the bias is 5.
targets = 2*xs - 3*zs + 5 + noise

# The shape of the targets should be n x m, where m is the number of output variables, so 1000 x 1.
print (targets.shape)

Output: (1000,1)
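
A quick sanity check (my own suggestion, not part of the original recipe): since the noise is drawn uniformly from [-1, 1], every target should lie within 1 of the noise-free plane 2x - 3z + 5.

# Optional sanity check: the targets should deviate from the noise-free
# function by at most 1, because the noise is uniform on [-1, 1].
clean = 2*xs - 3*zs + 5
print(np.max(np.abs(targets - clean)))

The printed value should be no greater than 1.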

Step 4: Plot the training data:

# In order to use the 3D plot, the objects should have a certain shape, so we reshape the targets.
# The proper method to use is reshape, which takes as arguments the dimensions in which we want to fit the object.
targets = targets.reshape(observations,)

# Declare the figure
fig = plt.figure()

# Method allowing us to create the 3D plot
ax = fig.add_subplot(111, projection='3d')

# Choose the axes.
ax.plot(xs,zs,targets)

# Set labels
ax.set_xlabel('xs')
ax.set_ylabel('zs')
ax.set_zlabel('Targets')

# You can change the azim parameter to plot the data from different angles.
ax.view_init(azim=100)

# The method shows the plot.
plt.show()

# Reshape the targets back to the shape that they were in before plotting.
targets = targets.reshape(observations,1)

Output: [3D plot of the training data]

Step 5: Initialize variables

# Initialize the weights and biases randomly in some small initial range.
# init_range is the variable used to set that range.
init_range = 0.1

# Weights are of size k x m, where k is the number of input variables and m is the number of output variables
# In our case, the weights matrix is 2x1 since there are 2 inputs (x and z) and one output (y)
weights = np.random.uniform(low=-init_range, high=init_range, size=(2, 1))

# Biases are of size 1 since there is only 1 output.
# The bias is a scalar.
biases = np.random.uniform(low=-init_range, high=init_range, size=1)

# Print the weights and biases to get a sense of how they were initialized.
print (weights)
print (biases)

Output:
[[ 0.02847566]
[-0.03640631]]
[-0.09045432]
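
Before training, it can be instructive to compute the loss once with these random parameters, so you have a baseline against which to compare the training curve. This is an optional sketch of my own, using the same loss formula introduced in Step 6.

# Optional: measure the loss of the untrained, randomly initialized model.
initial_deltas = np.dot(inputs, weights) + biases - targets
print(np.sum(initial_deltas ** 2) / 2 / observations)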

Step 6: Set the learning rate and train the model

# Set some small learning rate, denoted eta
# 0.02 is going to work quite well for our example
learning_rate = 0.02

# We iterate over our training dataset 100 times. That works well with a learning rate of 0.02.
# A lower learning rate would need more iterations, while a higher learning rate would need fewer iterations.
# A learning rate that is too high may cause the loss to diverge to infinity instead of converging to 0.
for i in range(100):

    # This is the linear model: the y = xw + b equation
    outputs = np.dot(inputs, weights) + biases
    # The deltas are the differences between the outputs and the targets
    deltas = outputs - targets

    # We are considering the L2-norm loss, but divided by 2.
    # Moreover, we further divide it by the number of observations.
    # This is simple rescaling by a constant: any function holding the basic property
    # of being lower for better results and higher for worse results
    # can serve as a loss function.
    loss = np.sum(deltas ** 2) / 2 / observations

    # Print the loss at each step so we can observe whether it is decreasing as desired.
    print(loss)

    # Another small trick is to scale the deltas the same way as the loss function.
    # In this way the learning rate is independent of the number of samples (observations),
    # so it can remain the same if we change the number of training samples.
    deltas_scaled = deltas / observations

    # Finally, we apply the gradient descent update rules.
    # The weights are 2x1, the learning rate is a scalar, the inputs are 1000x2, and deltas_scaled is 1000x1.
    # We must transpose the inputs so that the matrix multiplication is an allowed operation.
    weights = weights - learning_rate * np.dot(inputs.T, deltas_scaled)
    biases = biases - learning_rate * np.sum(deltas_scaled)


# The weights are updated in a linear algebraic way (a matrix minus another matrix)
# The biases, however, are just a single number here, so we must transform the deltas into a scalar.
# The two lines are both consistent with the gradient descent methodology.

Output: the value of the loss function printed at each of the 100 iterations; it should decrease steadily.
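
If you want to convince yourself that np.dot(inputs.T, deltas_scaled) really is the gradient of this loss with respect to the weights, a finite-difference check is a quick way to verify it after the loop finishes. This is an optional sketch of my own; epsilon = 1e-6 is an arbitrary small step.

# Optional: numerically verify the analytic gradient after training.
def loss_at(w, b):
    d = np.dot(inputs, w) + b - targets
    return np.sum(d ** 2) / 2 / observations

epsilon = 1e-6
analytic = np.dot(inputs.T, (np.dot(inputs, weights) + biases - targets) / observations)
for j in range(2):
    w_plus = weights.copy()
    w_plus[j] += epsilon
    numeric = (loss_at(w_plus, biases) - loss_at(weights, biases)) / epsilon
    # The analytic and numeric values should agree to several decimal places.
    print(analytic[j, 0], numeric)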

Step 7: Print weights and biases

# Print the weights and the biases, so we can see if they have converged to what we wanted.
# When we declared the targets following f(x,z), we set the weights to 2 and -3, and the bias to 5.
print (weights, biases)

Output:
[[ 1.99693418]
[-3.00669486]] [4.3158515]
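
With the trained parameters in hand, you can predict targets for new, unseen inputs using the same y = xw + b formula. The two input points below are made up purely for illustration.

# Predict on a couple of new inputs (values chosen arbitrarily).
new_inputs = np.array([[4.0, -2.0],
                       [0.5, 7.0]])
print(np.dot(new_inputs, weights) + biases)

Each prediction should land close to 2x - 3z + 5 for the corresponding row.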

Step 8: Print Output vs Targets

# print the outputs and the targets in order to see if they have a linear relationship.
plt.plot(outputs,targets)
plt.xlabel('outputs')
plt.ylabel('targets')
plt.show()

Output: [plot of outputs against targets]

Since these outputs come from the last iteration of training, they reflect the final model's accuracy.
The closer this plot is to a 45-degree line, the closer the outputs are to the targets.
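
Eyeballing the 45-degree line works, but you can also quantify the fit. Here is a minimal sketch (my own addition) using the root-mean-square error:

# Optional: quantify the fit with the root-mean-square error (RMSE).
rmse = np.sqrt(np.mean((outputs - targets) ** 2))
print(rmse)

Since the only irreducible error is the uniform noise on [-1, 1], the RMSE cannot go below roughly 0.58 (the standard deviation of that noise), so a value near that indicates a good fit.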
