Gradient descent: linear & logistic regression
Recently I have been coding ML and DL algorithms by hand to really understand how they work. Gradient descent is an oldie but essential: it underlies all neural network training, and many machine learning algorithms, such as SVMs, also use it.
Below is gradient descent on a toy problem. The following steps are used to execute gradient descent:
0. initialize w and b (pick starting values)
1. compute y_hat = w*x + b
2. compute the cost J to see how good w and b are
3. compute the partial derivatives dJ/dw and dJ/db, which tell us how much to change w and b with respect to J (the size of the 'steps down the hill')
4. update w and b:
w = w - (dJ/dw)*learning_rate
b = b - (dJ/db)*learning_rate
5. repeat the steps until w and b stop changing; this means we are at the minimum of J
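The steps above can be sketched as a minimal NumPy loop. The toy line y = 2x + 1, the learning rate, and the iteration count here are my own illustrative choices, not values from the post:

```python
import numpy as np

def gradient_descent(x, y, learning_rate=0.05, n_iters=5000):
    # Step 0: initialize w and b
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(n_iters):
        # Step 1: predictions
        y_hat = w * x + b
        # Step 2: mean squared error cost J (tracked to monitor convergence)
        J = np.mean((y_hat - y) ** 2)
        # Step 3: partial derivatives dJ/dw and dJ/db
        dJ_dw = (2 / n) * np.sum((y_hat - y) * x)
        dJ_db = (2 / n) * np.sum(y_hat - y)
        # Step 4: take a step downhill
        w -= learning_rate * dJ_dw
        b -= learning_rate * dJ_db
    # Step 5 is the loop itself: repeat until w and b settle
    return w, b

# Recover the line y = 2x + 1 from five noise-free points
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * x + 1
w, b = gradient_descent(x, y)
```

With a small enough learning rate the updates shrink as the slope flattens, so w and b settle near the true values 2 and 1.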
Load data
This dataset gives the size, weight, and species of fish. I use both linear and logistic regression to predict fish Weight from Height and Width. I obtained this data from Kaggle: https://www.kaggle.com/datasets/aungpyaeap/fish-market .
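A sketch of how the features and target could be pulled out of the data. The rows below are a made-up stand-in for `pd.read_csv("Fish.csv")` so the snippet runs without the Kaggle download; the column names match the fish-market dataset, but the values are illustrative:

```python
import pandas as pd

# Stand-in for pd.read_csv("Fish.csv"): a few illustrative rows
# with the same columns as the Kaggle fish-market dataset
df = pd.DataFrame({
    "Species": ["Bream", "Bream", "Perch", "Pike"],
    "Weight":  [242.0, 290.0, 5.9, 200.0],
    "Height":  [11.52, 12.48, 2.11, 10.81],
    "Width":   [4.02, 4.31, 1.41, 3.78],
})

# Features: Height and Width; target: Weight
X = df[["Height", "Width"]].to_numpy()
y = df["Weight"].to_numpy()
```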
Subfunctions
https://gist.github.com/j622amilah/96b1a586742e0deb02a390d65ed3a9f6
Prepare the data
Plotting
Scikit-learn: Linear Regression
Tensorflow: Linear Regression
By-hand: Linear Regression
Scikit-learn: Logistic Regression
In the scikit-learn formulation, y has to be binary or multi-class. So we cannot fit each continuous Weight (y) value directly, as we can with a from-scratch logistic regression whose sigmoid output is continuous.
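One way to force Weight into scikit-learn's classification formulation is to discretize it first, e.g. a binary "heavy vs. light" split at the median. This is a sketch with made-up stand-in data, not the post's actual preprocessing:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the fish data: [Height, Width] features and Weight target
X = np.array([[11.5, 4.0], [12.5, 4.3], [2.1, 1.4], [10.8, 3.8],
              [13.0, 4.5], [3.0, 1.6], [4.0, 2.0], [12.0, 4.1]])
weight = np.array([242.0, 290.0, 5.9, 200.0, 340.0, 12.0, 19.0, 300.0])

# scikit-learn's LogisticRegression is a classifier, so the continuous
# Weight must be discretized: here, above vs. below the median
y_binary = (weight > np.median(weight)).astype(int)

clf = LogisticRegression().fit(X, y_binary)
probs = clf.predict_proba(X)[:, 1]  # continuous probabilities in [0, 1]
```

The continuous output you do get back is `predict_proba`, but it is the probability of the discrete class, not an estimate of Weight itself.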
Tensorflow: Logistic Regression
By-hand: Logistic Regression
Interestingly, my hand-made version has a cost that increases no matter how I tune the parameters. In the compute_loss function, I added clipping to the sigmoid output to prevent an infinite cost when y_hat=1. I will keep tinkering with these values, the data, or the construction until the cost converges. The same code worked on a different dataset (the one from the Coursera Supervised Learning online class), so perhaps I need to scale the data…
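A minimal sketch of the two fixes mentioned: clipping the sigmoid output before the log loss, and standardizing the features. The names sigmoid and compute_loss mirror the post's, but these bodies are my assumption of a typical implementation, not the gist's exact code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_loss(y, y_hat, eps=1e-15):
    # Clip y_hat away from 0 and 1 so log(0) never occurs,
    # which would otherwise make the cost infinite
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def standardize(X):
    # Zero-mean, unit-variance features keep the gradient steps well
    # scaled, which often fixes a cost that climbs instead of converging
    return (X - X.mean(axis=0)) / X.std(axis=0)

X = np.array([[11.5, 4.0], [2.1, 1.4], [13.0, 4.5], [3.0, 1.6]])
Xs = standardize(X)
```

Unscaled Height and Width values push w*x + b to large magnitudes, where the sigmoid saturates and gradient updates can overshoot; standardizing usually tames this.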
RESULT
If we compare the mean squared error across all the linear and logistic regression versions, the scikit-learn linear regression has the lowest mean squared error and thus the best fit for this dataset. The hand-made linear regression version is second!
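For reference, the comparison boils down to scoring each model's predictions against the same targets. The prediction arrays below are hypothetical placeholders, not the post's actual results:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical predictions from two models on the same test targets
y_true = np.array([242.0, 290.0, 5.9, 200.0])
y_pred_sklearn = np.array([250.0, 280.0, 10.0, 195.0])
y_pred_byhand = np.array([260.0, 270.0, 20.0, 190.0])

mse_sklearn = mean_squared_error(y_true, y_pred_sklearn)
mse_byhand = mean_squared_error(y_true, y_pred_byhand)
# The model with the smaller MSE fits this dataset best
```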
Happy practicing!