Weight and bias calculation with a linear regression example

Max Curie
3 min read · Jan 29, 2023

A concise description of how a neural network calculates its weights (w) and biases (b)

Here are the basic steps for calculating the weight and bias.

Setup: the input is x, the output is y, and each layer a is computed as described below:

a[l] = g(w[l]·a[l-1] + b[l]), with a[0] = x

where a is the layer's nodes, g is the activation function, w is the weight, and b is the bias.

Step 1: initialize the weight (w) and bias (b)
Step 2: propagate w and b through the layers
Step 3: calculate the cost function and its gradient
Step 4: backpropagate to update the weight and bias: w = w + dw, b = b + db
Step 5: repeat steps 2-4 until the cost function is low enough

A detailed example using linear regression is presented below:

with input x = [1, 2, …, 10] and output y = [2, 4, …, 20]; in other words, y = 2x (so ideally w = 2, b = 0).
We use a linear activation function (g = 1) and the Mean Squared Error loss function L = (a-y)²/2. We omit the layer superscripts for simplicity, since we only have one layer.

Step 1: initialize w = 1, b = 0
Step 2: propagate the input through the layer:

a = g·(w·x + b)
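To make the first iteration concrete (worked numbers, consistent with the script and output further below): with the initial w = 1 and b = 0,

a = 1·(1·x + 0) = x = [1, 2, …, 10]

so the residual is a - y = x - 2x = -x.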

Step 3: calculate the cost function J:

J = mean((a-y)²)/2
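For the first iteration, with a - y = -x, this evaluates to

J = mean(x²)/2 = 38.5/2 = 19.25

since the mean of 1² + 2² + … + 10² = 385 over 10 points is 38.5.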

Calculate the gradient for the weight. Since J = mean((a-y)²)/2 and a = w·x + b, the chain rule gives

dJ/dw = mean((a-y)·x)
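For the first iteration this is

dJ/dw = mean(-x·x) = -mean(x²) = -38.5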

Similarly, calculate the gradient for the bias:

dJ/db = mean(a-y)
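which for the first iteration is

dJ/db = mean(-x) = -5.5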

Step 4: update the weight (w) and bias (b), where alpha is the learning rate:

w = w + dw, with dw = -alpha·(dJ/dw)

Similarly, for b:

b = b + db, with db = -alpha·(dJ/db)
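With alpha = 0.01, the first update therefore comes out to

w = 1 - 0.01·(-38.5) = 1.385
b = 0 - 0.01·(-5.5) = 0.055

which matches the second entries of w_list and b_list in the output further below.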

Now repeat steps 2-4 until the cost function J is low enough; w and b will then be close to 2 and 0.

The code to do that (Python):

https://github.com/maxtcurie/Artificial_Intelligence/blob/main/0My_teaching/Weight_calc_example_linear.py

import numpy as np 
import matplotlib.pyplot as plt

#data to fit: y=2x
x=np.arange(1,11) #input x=[1,2,...10]
y=2.*x #output

#learning rate
alpha=0.01

#criteria for cost function J
J_CRIT=0.01

#define the cost function
def J_calc(a,y):
    J=0.5*np.mean( (a-y)**2. ) #Mean Squared Error
    return J

#Step 1: Initialize w (weight) and b (bias)
w=1
b=0

g=1 #linear activation function

J=100 #initial cost, large enough to enter the loop
w_list=[w]
b_list=[b]

while J>J_CRIT:
    #Step 2: Propagate
    a2=g*(w*x+b)

    #Step 3: calculate the cost function
    J=J_calc(a2,y)

    #Step 4: calculate the gradient
    #4.1.1 dJ_dw=dJ/dw
    dJ_dw=np.mean((a2-y)*x)
    #4.1.2 dw=-alpha*(dJ/dw)
    dw=-alpha*dJ_dw

    #4.2.1 dJ_db=dJ/db
    dJ_db=np.mean(a2-y)
    #4.2.2 db=-alpha*(dJ/db)
    db=-alpha*dJ_db

    #4.3 update w and b
    w=w+dw
    b=b+db

    w_list.append(w)
    b_list.append(b)

print('w_list: ')
print(w_list)
print('b_list: ')
print(b_list)

x=np.arange(0,11,1) #include 0 for plotting

plt.clf()
for (w,b) in zip(w_list,b_list):
    plt.plot(x,w*x+b,color='blue',alpha=0.3)
plt.plot(x,w_list[-1]*x+b_list[-1],color='blue',label='final fit')
plt.scatter(x,2*x,color='red',label='data')
plt.grid(alpha=0.3)
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(0,10)
plt.ylim(0,20)
plt.legend()
plt.show()
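As an extra sanity check (my addition, not part of the original script), the learned w and b can be compared against NumPy's closed-form least-squares fit:

#sanity check (not in the original script):
#np.polyfit solves the same least-squares problem in closed form
x_data=np.arange(1,11)
y_data=2.*x_data
w_fit,b_fit=np.polyfit(x_data,y_data,1) #degree 1: returns slope, intercept
print(w_fit,b_fit) #2.0 and ~0.0 (up to floating-point error)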

The outputs

w_list:
[1, 1.385, 1.61875, 1.760676125, 1.846855961875, 1.89919220538125, 1.9309819765339844, 1.9502979545119492, 1.9620410610579773, 1.969186644705852]
b_list:
[0, 0.055, 0.08827499999999999, 0.108361, 0.12044020312499999, 0.127658723190625, 0.13192656466275, 0.13440329030675335, 0.1357928699055286, 0.13652268284828456]

The plot

This plot shows the final fit (solid blue line), the data (red dots), and the progression of the fit: the light blue lines start near the bottom (the initial guess) and gradually approach the final fit.

Appendix

The derivations use the following format: the down arrow marks the additional information used to arrive at the next step.

I also made a video on my YouTube channel that may be helpful: https://youtu.be/nNNRM-Wf_Z0
