#6 ML …. mini-hurray !!!

Abhinaba Bala
Aug 9, 2017 · 2 min read

9.8.17 Wednesday

A great day for mankind. Kidding. A great day for me :). After one week, the code for Mini Batch GD is finally here!!!!

It took me time to digest the fact that just tweaking the Batch GD code a little bit gives you Mini Batch GD! After that, the code for it was ready within 10 minutes :).
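The post doesn't show the code itself, so here is a minimal sketch of what that "little tweak" might look like, assuming a simple linear model y = m*x + b with a mean-squared-error cost (the function name, data shapes, and default hyperparameters are my own illustration, not the post's actual code):

```python
import numpy as np

def gradient_descent(x, y, lr=0.05, epochs=500, batch_size=None):
    """Fit y = m*x + b by minimizing the MSE cost.

    batch_size=None uses the full dataset each step (Batch GD);
    any smaller value turns the very same loop into Mini Batch GD -
    that is the whole tweak.
    """
    m, b = 0.0, 0.0
    n = len(x)
    bs = n if batch_size is None else batch_size
    for _ in range(epochs):
        order = np.random.permutation(n)        # reshuffle every epoch
        for start in range(0, n, bs):
            sl = order[start:start + bs]        # indices of this (mini) batch
            err = m * x[sl] + b - y[sl]         # prediction error on the batch
            m -= lr * 2 * np.mean(err * x[sl])  # d(MSE)/dm on the batch
            b -= lr * 2 * np.mean(err)          # d(MSE)/db on the batch
    cost = np.mean((m * x + b - y) ** 2)        # final cost on the full data
    return m, b, cost
```

Calling `gradient_descent(x, y)` gives Batch GD; `gradient_descent(x, y, batch_size=16)` gives Mini Batch GD from the identical code path.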

One week went into other things; thankfully, today the output was great.

The code for both methods will be uploaded to GitHub tomorrow.

About the output:

For Batch GD: m=0.17701987668424726; b=0.3178956232725284; f=5.30973758237

For Mini Batch GD, the values of the parameters and cost function will, of course, differ from run to run, since the mini-batches are sampled randomly:

Take 1: m=0.1923070083863004; b=0.18753066202974217; f=5.48722549445

Take 2: m=0.19330741310897784; b=0.18244372028429637; f=5.494385617

The hearty, quality results:

If one looks at the cost function and parameters after each step in both cases, the results match the theoretical predictions!

  1. For MBGD: it is known that the cost function will keep jumping around even near the optimum. On studying the output I was pleased to see exactly that: the values were not strictly decreasing, just as theory says for this method.
  2. For BGD: the cost function decreased strictly. The BGD output was also better tuned and more fully minimized, again in accordance with the theory.
  3. The time taken by BGD was significantly higher than by MBGD. Even on my small machine, with a small dataset, I got to see why MBGD is used over BGD, and I recorded the run-time difference between the two in a video. Although it depends on the hyperparameters too, with conventional values BGD took almost 8–10 seconds more.
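Points 1 and 2 can be checked mechanically: record the full-data cost after every single update and test the sequence for monotonicity. A minimal sketch, again assuming a y = m*x + b / MSE setup (the data, step size, and function name are made up for illustration; this is not the post's code):

```python
import numpy as np

def gd_cost_trace(x, y, lr=0.05, epochs=50, batch_size=None):
    """Run (mini) batch GD on y = m*x + b and log the full-data MSE
    after every single parameter update."""
    m, b = 0.0, 0.0
    n = len(x)
    bs = n if batch_size is None else batch_size
    costs = []
    for _ in range(epochs):
        order = np.random.permutation(n)
        for start in range(0, n, bs):
            sl = order[start:start + bs]
            err = m * x[sl] + b - y[sl]
            m -= lr * 2 * np.mean(err * x[sl])
            b -= lr * 2 * np.mean(err)
            costs.append(np.mean((m * x + b - y) ** 2))
    return costs

np.random.seed(1)
x = np.random.rand(100)
y = 0.2 * x + 0.3 + 0.05 * np.random.randn(100)  # noisy data, so MBGD jitters

bgd = gd_cost_trace(x, y)                  # full batch: strictly decreasing
mbgd = gd_cost_trace(x, y, batch_size=8)   # mini batch: jumps up now and then
print(all(d < 0 for d in np.diff(bgd)), any(d > 0 for d in np.diff(mbgd)))
```

The BGD trace decreases at every step (as it must for a quadratic cost with a small enough step size), while the MBGD trace shows occasional upward jumps, matching points 1 and 2.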

The output for Stochastic GD should be similar, since only the batch length needs to change (to 1).
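Concretely, Stochastic GD is the mini-batch loop with batch length 1: one randomly ordered sample per update. A small self-contained sketch under the same assumed y = m*x + b setup (names are illustrative):

```python
import numpy as np

def sgd(x, y, lr=0.02, epochs=200):
    """Stochastic GD for y = m*x + b: the mini-batch loop with
    batch size 1, i.e. one randomly ordered sample per update."""
    m, b = 0.0, 0.0
    for _ in range(epochs):
        for i in np.random.permutation(len(x)):
            err = m * x[i] + b - y[i]   # error on a single sample
            m -= lr * 2 * err * x[i]
            b -= lr * 2 * err
    return m, b, np.mean((m * x + b - y) ** 2)
```

Because each update sees only one sample, the cost trace jumps even more than MBGD's, but the updates themselves are the cheapest of the three methods.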

So I have almost reached my aim, i.e., to study the main methods for minimizing cost functions and to observe them in action.

For now I have no plans to go after the Adam, Adagrad … methods. The next goal (independent of this blog) will be to study different aspects of divergence with respect to the data.

My learning of Machine Learning

Learn ML rather than use it as a black box.
