Machine Learning with TensorFlow #2

Ashley Hendrickson · Published in The Startup · 5 min read · Oct 20, 2020

Logistic regression

Logistic regression is a statistical approach used to model the probability of a certain binary event, such as pass/fail. This forms the basic principle that will later be used in image recognition and other classification problems. There is a sense of accuracy for classification, unlike regression, which is just an approximation. In this post we’ll use regression to approximate exponential growth or exponential decay.

We’ll start our example by importing the libraries we need.
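The original import cell isn’t embedded here, so this is a minimal sketch covering everything used below:

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
```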

For this example I’ll be populating a list with values that grow exponentially, doubling once every two time units. I’m using a formula, but this would also work with random values; your end result just won’t be as exact. We will be asking the question: how long does it take for our function to double?

First we need a function to generate points that increase exponentially, and a code segment that populates our list with those values.
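The original gist isn’t embedded here, so this is a sketch consistent with the description; the constant C, the doubling period of two, and the range of x values are my assumptions:

```python
# Hypothetical generator: value at time t, doubling once every two time units
def grow(t, C=100.0):
    return C * 2.0 ** (t / 2.0)

# Deliberately large x values (we'll center them before fitting)
X = np.arange(1000.0, 1200.0)
data = grow(X)  # NumPy broadcasts, so this fills the whole list at once
```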

This model is growing exponentially; however, with some manipulation we can make it look linear.

Start from the growth model, y = C·rᵗ, and divide both sides by C: y/C = rᵗ
Take the log of the right-hand side’s base: logᵣ(y/C) = t
Convert the logarithm to a fixed base with the change-of-base rule (the numbers later in the post work out if that base is 2): log(y/C) / log(r) = t
Multiply both sides by log(r): log(y/C) = t·log(r)
Use the logarithmic property of division: log(y) − log(C) = t·log(r)
Move log(C) to the right-hand side for our final product: log(y) = log(r)·t + log(C)

The result closely resembles y = mx + b, with slope m = log(r) and intercept b = log(C), meaning we can use linear regression.

We take the data, calculate the log of each value, and set the resulting list equal to y. Once the points are plotted, there appears to be a clear linear correlation.
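In code (the base-2 log is inferred from the numbers later in the post):

```python
# Log-transform the data; the exponential curve becomes a straight line
y = np.log2(data)

plt.scatter(X, y, s=2)
plt.xlabel('t')
plt.ylabel('log2(y)')
plt.show()
```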

Since the values on the x-axis are so big, they would throw our estimation off, so we must first center the x values. We can do that by subtracting the mean.
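In code, that’s a one-liner:

```python
# Center the inputs so the optimizer isn't fighting huge x values
X_centered = X - X.mean()
```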

Now the model must be instantiated.

The Sequential model is only appropriate for a plain stack of layers, where each layer has exactly one input tensor and one output tensor.

The input shape is a tuple of integers which doesn’t include the batch size. Since our input shape is (1,), the model expects batches of one-dimensional vectors with a single value each.

The Dense layer is a regular, densely connected neural network layer with weights, biases, and an optional activation function; it’s the most commonly and frequently used layer.

We are passing 1 to our Dense layer because the dimensionality of the output space is 1.
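Putting that together, a minimal sketch of the model, with layer sizes taken from the description above:

```python
# A single Dense unit on a one-dimensional input: effectively y = m*x + b
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(1, input_shape=(1,))
])
```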

We can also get some important info from our model, which will be very useful when we run into bugs later on.
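Keras prints that overview with:

```python
# Layer-by-layer summary: output shapes and parameter counts
model.summary()
```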

The model is then compiled. For the optimizer I am using SGD (stochastic gradient descent). The only reason we’re using SGD is to familiarize everyone with different optimizers. Adam is a little newer and tries to improve on SGD; however, under certain circumstances Adam has failed to converge to an optimal solution and has been outperformed by SGD. The two parameters that we are passing in are the learning rate and the momentum.

For the loss function we are just using the mean squared error.
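The compile step, with the learning rate and momentum values as assumptions on my part (tune them if the loss doesn’t converge):

```python
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
    loss='mse',
)
```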

The learning rate scheduler takes a function with two parameters: the epoch number and the current learning rate. An epoch represents one use of all the training vectors to update the weights and biases. The schedule function is very useful for more complicated situations, such as “after every n epochs, change the learning rate by m.”

For every epoch, this callback gets the updated learning rate value from the scheduler function, and the result is applied to the optimizer’s learning rate. The exact number of epochs isn’t super important: the more epochs, the more your weights get updated, and the longer your model takes to fit the data.
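A minimal sketch; this schedule just returns the rate unchanged, and the epoch count is an assumption:

```python
# The schedule receives the epoch index and current rate and returns the new rate
def schedule(epoch, lr):
    return lr  # a real schedule might decay lr after n epochs

scheduler = tf.keras.callbacks.LearningRateScheduler(schedule)

# Fit on the centered x values and log-transformed y values
history = model.fit(X_centered.reshape(-1, 1), y.reshape(-1, 1),
                    epochs=500, callbacks=[scheduler], verbose=0)
```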

Our losses per iteration converge, which is great; otherwise we’d have to do some debugging.
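You can check by plotting the recorded losses:

```python
# Plot the training loss per epoch to confirm convergence
plt.plot(history.history['loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()
```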

get_weights returns the layer’s weights as a list of NumPy arrays. So for our purposes it returns a one-by-one matrix (the kernel) and a vector of length 1 (the bias).
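For our single Dense layer:

```python
# kernel is a 1x1 matrix, bias is a length-1 vector
kernel, bias = model.layers[0].get_weights()
m = kernel[0][0]  # slope of the fitted line, i.e. log2(r)
b = bias[0]       # intercept, i.e. log2(C) (shifted by the centering)
print(m, b)
```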

So m = log₂(r), or 0.49995455, is the slope of our linear equation. Now we need to find the time difference between our original exponential growth function and our doubled exponential growth function, since we want to know how long it takes to double.

Doubled growth function: 2y = C·r^(t+Δt)
Original growth function: y = C·rᵗ
We start off by dividing the two functions and cancelling out as many variables as we can: 2 = r^Δt
Taking the log of both sides leaves us with the time difference it takes the function to double, which we can refer to as Δt: Δt = log(2) / log(r)

However, our values were computed with respect to our linear formula, so the rate log₂(r) is already calculated: it’s the slope m. And since we took base-2 logs, log₂(2) = 1.

Our final result for the time difference it takes to double: Δt = log₂(2) / log₂(r) = 1/m
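In code, with the slope recovered above:

```python
# Doubling time: delta_t = log2(2) / log2(r) = 1 / m
delta_t = 1.0 / m
print(delta_t)  # ~2.000181810692627 in the article's run
```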

So the time it takes this function to double is 2.000181810692627 units. The actual answer is 2, so the error is about 0.0091%.

https://colab.research.google.com/drive/1S505dLc6hfVtYAqJTTh2xDRIUxLi8YzZ?usp=sharing
