If you want to learn Data Science, take a few of these statistics classes

Now onto statistics and probability.

Class Central’s homepage.

How we picked courses to consider

  1. It must be an introductory course with little to no statistics or probability experience required.
  2. It must be on-demand or offered every few months.
  3. It must be of decent length: at least ten hours in total for estimated completion.
  4. It must be an interactive online course, so no books or read-only tutorials. Though these are viable ways to learn statistics and probability, this guide focuses on courses.

How we evaluated courses

  1. The degree to which each course teaches statistics through coding up examples — preferably in R or Python.
  2. Coverage of the fundamentals of probability and statistics. Covering descriptive statistics, inferential statistics, and probability theory is ideal.
  3. Relevance of the syllabus to data science. Specialized content such as genomics (as in several biostatistics courses) and advanced concepts not often used in data science count against a course.
R and Python are the two most popular programming languages for data science.

Why Target Coding?

Statistics AND Probability

Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.
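
To make the distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not material from any of the courses: the coin-flip setup, the hidden bias of 0.7, and the helper name `binomial_pmf` are all hypothetical choices made for this example. The first half starts from a known model and predicts a future outcome (probability); the second half starts from recorded outcomes and works back to the model (statistics).

```python
import random
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability: the model is known (a fair coin), so we can compute the
# likelihood of a future event, here exactly 3 heads in 5 flips.
p_three_heads = binomial_pmf(3, 5, 0.5)

# Statistics: the model is unknown, so we estimate it from the frequency
# of past events. The "true" bias below is a hypothetical value that the
# analyst would not see in practice; only the recorded flips are observed.
random.seed(0)
hidden_bias = 0.7
flips = [random.random() < hidden_bias for _ in range(10_000)]
estimated_bias = sum(flips) / len(flips)

print(f"P(3 heads in 5 fair flips) = {p_three_heads}")
print(f"Estimated P(heads) from data = {estimated_bias:.3f}")
```

With 10,000 simulated flips the estimate should land close to the hidden bias, which is the sense in which statistics recovers a model from observed frequencies.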

Our picks for the best statistics and probability courses for data scientists are…

The University of Texas at Austin’s edX page.
A promotional video from the “Foundations of Data Analysis” instructor, Michael J. Mahometa.

A stellar specialization

Duke University’s Coursera page.
A promotional video from the Statistics with R instructor, Dr. Mine Çetinkaya-Rundel, for the specialization’s original course, Data Analysis and Statistical Inference.

Want more probability?

MIT’s edX page.
A promotional video from the course’s instructors, John Tsitsiklis and Patrick Jaillet.

The competition

  • MedStats: Statistics in Medicine (Stanford University/Stanford OpenEdx): Great syllabus where the examples have a medical focus. Covers a bit of R programming at the end, though not as much as UT Austin’s series. A worthy option for anyone, even those not targeting medicine. It has a 4.58-star weighted average rating over 32 reviews.
  • SOC120x: I “Heart” Stats: Learning to Love Statistics (University of Notre Dame/edX): Targets a non-technical audience, though likely would be good for anyone. No coding. Good production value. Course and instructors look really fun. It has a 4.54-star weighted average rating over 12 reviews.
  • QM101x: Statistics for Business (Indian Institute of Management Bangalore/edX): Part of a 4-course series. Business focus. Good syllabus that uses coding. The last two courses in the series are unreleased as of November 2016, so I can’t make a judgment yet. It has a 4.43-star weighted average rating over 27 reviews.
  • Workshop in Probability and Statistics (Udemy): Taught by Dr. George Ingersoll, Associate Dean of Executive MBA Programs at the UCLA Anderson School of Management. Costs money. Uses Excel. It has a 4.4-star weighted average rating over 452 reviews.
  • Intro to Descriptive Statistics (San Jose State University/Udacity): Part of a 2-course series. Bite-sized videos. No coding. It has a 3.88-star weighted average rating over 8 reviews.
  • Intro to Inferential Statistics (San Jose State University/Udacity): Part of a 2-course series. I took both courses as refreshers for my undergrad statistics classes and came away with a deeper understanding. Really enjoyed Katie Kormanik’s teaching style (see video below). Bite-sized videos. No coding. It has a 4.4-star weighted average rating over 5 reviews.
The intro to Udacity’s Intro to Inferential Statistics course. This course is meant to be taken after Udacity’s Intro to Descriptive Statistics course.
The University of Amsterdam’s Methods and Statistics in Social Sciences Specialization contains Basic Statistics and Inferential Statistics.
  • PH525.1x: Statistics and R (Harvard University/edX): Part of a 7-course series on edX. Life sciences focus. Uses R programming, but the reviews suggest UT Austin’s series is better. It has a 3.96-star weighted average rating over 26 reviews.
  • PH525.3x: Statistical Inference and Modeling for High-throughput Experiments (Harvard University/edX): Part of a 7-course series on edX. Life sciences focus. Uses R programming, but the reviews suggest UT Austin’s series is better. It has a 4.63-star weighted average rating over 4 reviews.
  • Intro to Statistics (Udacity): This is one of Udacity’s earliest courses and it has its shortcomings, as described in this memorable review by a college educator. No coding. It has a 3.93-star weighted average rating over 41 reviews.
  • Mathematical Biostatistics Boot Camp 1 (Johns Hopkins University/Coursera): Part of a 2-course series. Biostatistics focus. It has a 3.13-star weighted average rating over 23 reviews.
  • Mathematical Biostatistics Boot Camp 2 (Johns Hopkins University/Coursera): Part of a 2-course series. Biostatistics focus. It has a 3.83-star weighted average rating over 3 reviews.
  • KIexploRx: Explore Statistics with R (Karolinska Institutet/edX): More of a data exploration course than a statistics course. Uses coding. It has a 3.77-star weighted average rating over 22 reviews.
  • Statistical Inference (Johns Hopkins University/Coursera): One of two statistics courses in JHU’s data science specialization. Bad reviews. It has a 2.9-star weighted average rating over 29 reviews.
  • Regression Models (Johns Hopkins University/Coursera): One of two statistics courses in JHU’s data science specialization. Bad reviews. It has a 2.73-star weighted average rating over 30 reviews.
  • DS101X: Statistical Thinking for Data Science and Analytics (Columbia University/edX): Part of the Microsoft Professional Program Certificate in Data Science. Short syllabus. Bad reviews. It has a 2.77-star weighted average rating over 24 reviews.
  • Understanding Clinical Research: Behind the Statistics (University of Cape Town/Coursera): “This isn’t a comprehensive statistics course, but it offers a practical orientation to the field of medical research and commonly used statistical analysis.” Health care focus. It has a 5-star weighted average rating over 15 reviews.
  • MED101x: Introduction to Applied Biostatistics: Statistics for Medical Research (Osaka University/edX): Biostatistics focus. Uses coding. It has a 4.5-star weighted average rating over 3 reviews.
  • Probability and Statistics (Stanford University/Stanford OpenEdx): Curriculum looks great. The one review is really positive. No coding. It has a 4.5-star weighted average rating over 1 review.
Stanford’s Probability and Statistics course looks great, but lacks reviews.
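
The list above quotes a weighted average rating for each course without spelling out the calculation. One plausible reading, assumed here and not confirmed by the text, is that average ratings from several review sites are combined with each site weighted by its number of reviews. The sketch below shows that arithmetic; the function name `weighted_average_rating` and the sample numbers are hypothetical and do not correspond to any course listed.

```python
def weighted_average_rating(sources):
    """Combine (average_rating, review_count) pairs into one rating.

    Each source's average is weighted by its review count, so a site with
    20 reviews influences the result ten times as much as a site with 2.
    Returns None when there are no reviews at all.
    """
    total_reviews = sum(count for _, count in sources)
    if total_reviews == 0:
        return None
    weighted_sum = sum(rating * count for rating, count in sources)
    return weighted_sum / total_reviews


# Hypothetical ratings from three review sites for one imaginary course.
hypothetical_sources = [(4.5, 20), (4.0, 10), (5.0, 2)]
overall = weighted_average_rating(hypothetical_sources)
total = sum(count for _, count in hypothetical_sources)

print(f"{overall:.2f}-star weighted average rating over {total} reviews")
```

Under this reading, a rating backed by only a handful of reviews is far less stable than one backed by dozens, which is worth keeping in mind when comparing the courses above.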

Wrapping it Up
