Update:

PK Banks
data-science-machine-learning-101
4 min read · Jan 17, 2017

Stanford Machine Learning: this is hard
Columbia Data Science Statistics Course: I’m happy I took it, ecstatic that it’s over

(Sorry for the lag between posts.)
Here’s what I’ve done since my last update:

Stanford Machine Learning (Coursera)

I finished weeks 1 and 2 of course content, which was fine.

The material is fairly technical, which is to say that if you were good at high school math and calculus, you will enjoy it. For the rest of us, it’s a challenge to revisit matrix math, derivatives and linear algebra.

The course instructor is good about stating things simply. However, I find the style of communication dry. There’s also a paucity of code challenges along the way. Such challenges are a valuable part of the learning process, as they give students the opportunity to apply the concepts rather than passively follow along.

(Aside: It surprises me that the course is regarded as the gold standard in the field. I find it not very well conveyed and its structure very thin. It’s awesome to have a series of recorded lectures in machine learning by such a prominent person in the space, but the course is taught as though we have a private tutor who sits at a computer in his basement while we watch over his shoulder as he talks and points at things on his screen. At least this private tutor is free, and we get to replay what he says without limit, but it’s still not a great format.)

As a result, I lacked command of the material and had extreme difficulty implementing the most basic elements of the first problem set, which involves gradient descent for a univariate regression model.

The coursework is difficult because we have to learn not only the concept (i.e., the mechanics of how predictive modeling works) but also the implementation (i.e., how to actually build the model), using a software package (Octave) that is entirely new to me.
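For anyone curious what gradient descent on a univariate regression looks like, here is a minimal sketch. It is written in Python rather than Octave and is not the course’s assignment or its solution. The function name, the learning rate, the iteration count and the toy data are all assumptions chosen for illustration.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit y ~ theta0 + theta1 * x by batch gradient descent.

    Minimizes the mean squared error cost
    J = (1 / (2m)) * sum((theta0 + theta1 * x - y) ** 2).
    alpha (learning rate) and iterations are illustrative defaults.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # start from an all-zero guess
    for _ in range(iterations):
        # Prediction error for every training example.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to each parameter.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical toy data generated from y = 1 + 2x (no noise).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(f"intercept ~ {theta0:.4f}, slope ~ {theta1:.4f}")
```

On this toy data the fitted intercept and slope should land very close to 1 and 2. With a learning rate that is too large the updates diverge instead of converging, which is one of the things that makes a first implementation tricky.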

I do not think that this is an insurmountable task. I think that with persistence and repetition, I can learn and acquire these skills. These things are typical points of resistance while learning something new.

My difficulty with the problem set inspired me to focus on other tasks to wash off the residue of confusion and frustration with this course. It’s been about two weeks, so perhaps it’s time to go back to it.

ColumbiaX: DS101X Statistical Thinking for Data Science and Analytics

I completed this course within the prescribed 5-week timeframe.

The course material is lecture-intensive, with virtually zero code challenges or problem sets.

The video lectures are barely acceptable. While the instructors may be experts in their field, their teaching style is difficult to sit through. Being good at research and publishing papers does not a good teacher make.

There are two highlights worth mentioning:

• The Data Visualization section, by James Curley, is exceptionally well taught. He provides specific examples of data visualization techniques and examples of case studies. I highly recommend this section for anyone interested in data visualization.
You can see the first section in the video here:
https://youtu.be/_CXWWq6RmEo
See James Curley’s page here:
http://curleylab.psych.columbia.edu/curley.html

• The Business Insights section, by Eva Ascarza, is an excellent module. Eva explains how we can use Bayesian statistics in the context of business. She adeptly bridges the gap between abstract theory and applied knowledge. She also uses multiple case studies to help guide us through the process. (The only thing missing here is a walkthrough of the actual implementation of the model. I regret that we are shown only the theory and some hypothetical results.)

You can see the first section in the video here:
https://youtu.be/_J39vFtRPK8

My assessment of the Columbia course, which I took because it is part of the Microsoft Data Science track, is that I’m glad I did it and I’m excited to be done with it. While it serves some good, it was merely a ‘check-the-box’ kind of course. Unless you are abundantly motivated or exceptionally fond of how academics speak, you will very likely get bored at some point and not complete the course.

What’s next:
Harvard’s CS50, with LaunchCode
// This is a program in Miami that integrates the well-known CS50 class with classroom seminars and real-life teaching assistants.
I’m doing this as a fun activity without a specific agenda. It’s a way to be involved in the community and potentially discover something great without expecting it.

ColumbiaX: CSMM.102x Machine Learning
// As I have struggled with the Stanford course, I am looking at this course as an alternative. I value learning the same thing from different sources. Sometimes it helps to hear the same lessons phrased differently or communicated in a different manner. I’m going to look into this and see if it’s a better fit for me.
