Online Courses Data from 2012 to 2016
edX, an online learning platform co-founded by Harvard and MIT, published a dataset containing its course titles, subjects, launch years and aggregate statistics on participants and participation.
In general, HarvardX offered more Humanities courses, which started in 2013 and grew quickly over the next two years. The 2016 data runs only through August 2016 and doesn’t cover the full year.
These free courses immediately attracted many participants. Computer Science courses, for example, have a median enrolment of 21k. People (including those who didn’t finish the course) also tend to spend more time on scientific subjects.
However, only 5% of all enrolments end in certification. The rate is lower in Social Science and Technology at 2%, possibly because these courses require more hours.
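To make the two headline metrics concrete, here is a minimal sketch of how median enrolment and certification rate per subject could be computed. The field names (`subject`, `participants`, `certified`) and the rows are invented for illustration and are not taken from the edX dataset; pooling certificates over participants is one reasonable definition of the rate, not necessarily the one used in this analysis.

```python
from collections import defaultdict
from statistics import median

# Toy rows, invented for illustration only (NOT from the edX dataset).
# The field names are assumptions.
courses = [
    {"subject": "Computer Science", "participants": 20000, "certified": 400},
    {"subject": "Computer Science", "participants": 30000, "certified": 600},
    {"subject": "Humanities", "participants": 5000, "certified": 500},
    {"subject": "Humanities", "participants": 15000, "certified": 900},
]


def summarize_by_subject(rows):
    """Return {subject: (median participants, pooled certification rate)}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["subject"]].append(row)

    summary = {}
    for subject, items in grouped.items():
        total_participants = sum(r["participants"] for r in items)
        total_certified = sum(r["certified"] for r in items)
        summary[subject] = (
            median(r["participants"] for r in items),
            total_certified / total_participants,
        )
    return summary


summary = summarize_by_subject(courses)
for subject, (med, rate) in summary.items():
    print(f"{subject}: median enrolment {med:,.0f}, certification rate {rate:.1%}")
```

Pooling weights large courses more heavily; averaging per-course rates instead would give every course equal weight and can produce a noticeably different number.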
In December 2015, edX stopped offering free certificates. From that point the certification rate dropped by more than half, from 8% in 15Q4 to 3% in 16Q1.
The audit rate is 10 percentage points higher in Art subjects (22%) than in Science subjects (12%). Overall, only 17% of enrolments audited more than 50% of the content.
After certification charges were introduced in 15Q4, the audit rate temporarily dropped from 24% to 15%, then bounced back.
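The quarter-by-quarter comparison can be sketched as follows. This assumes each course is assigned to the quarter of its launch date, which may not match how the figures above were grouped; the field names and rows are hypothetical.

```python
from collections import defaultdict
from datetime import date


def quarter_label(d):
    """Format a date in the post's shorthand, e.g. 2015-12-01 -> '15Q4'."""
    return f"{d.year % 100:02d}Q{(d.month - 1) // 3 + 1}"


# Toy rows, invented for illustration only (NOT from the edX dataset).
courses = [
    {"launch": date(2015, 10, 5), "participants": 10000, "certified": 900},
    {"launch": date(2015, 11, 20), "participants": 5000, "certified": 600},
    {"launch": date(2016, 1, 15), "participants": 10000, "certified": 400},
]


def rate_by_quarter(rows, numerator="certified"):
    """Pool a count field over participants for each launch quarter."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        quarter = quarter_label(row["launch"])
        totals[quarter][0] += row[numerator]
        totals[quarter][1] += row["participants"]
    # Sorting the labels as strings is chronological within one century.
    return {q: num / den for q, (num, den) in sorted(totals.items())}


rates = rate_by_quarter(courses)
for quarter, rate in rates.items():
    print(f"{quarter}: {rate:.1%}")
```

Passing a different `numerator` (for example a hypothetical `audited` field) would give the audit rate per quarter in the same way.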
When we look at individual courses, we can use a treemap to visualize the number of enrolments as tile size and the certification rate as color.
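For readers unfamiliar with treemaps, the sketch below shows the idea using a simple slice-and-dice layout: the canvas is cut into one column per subject, each column is cut into one tile per course, and tile area is proportional to enrolment. It is only an illustration in plain Python; a real chart would normally come from a plotting library. All names and numbers are made up.

```python
# Toy rows, invented for illustration only (NOT from the edX dataset).
courses = [
    {"subject": "A", "course": "A1", "participants": 6000, "cert_rate": 0.02},
    {"subject": "A", "course": "A2", "participants": 2000, "cert_rate": 0.10},
    {"subject": "B", "course": "B1", "participants": 2000, "cert_rate": 0.06},
]


def split(items, size_of, x, y, w, h, along_x):
    """Cut rectangle (x, y, w, h) into strips with area proportional to size_of(item)."""
    total = sum(size_of(item) for item in items)
    out, offset = [], 0.0
    for item in items:
        share = size_of(item) / total
        if along_x:
            out.append((item, x + offset, y, w * share, h))
            offset += w * share
        else:
            out.append((item, x, y + offset, w, h * share))
            offset += h * share
    return out


def group_size(group):
    return sum(row["participants"] for row in group)


def treemap(rows, width=100.0, height=100.0):
    """Two-level slice-and-dice layout: subjects as columns, courses stacked inside."""
    by_subject = {}
    for row in rows:
        by_subject.setdefault(row["subject"], []).append(row)

    tiles = []
    columns = split(list(by_subject.values()), group_size, 0.0, 0.0, width, height, True)
    for group, gx, gy, gw, gh in columns:
        for row, x, y, w, h in split(group, lambda r: r["participants"], gx, gy, gw, gh, False):
            tiles.append({"course": row["course"], "x": x, "y": y, "w": w, "h": h,
                          "cert_rate": row["cert_rate"]})
    return tiles


def shade(rate, max_rate):
    """Map a rate to a hex colour from white (0) to dark green (max_rate)."""
    t = 0.0 if max_rate == 0 else min(rate / max_rate, 1.0)
    red_blue = round(255 * (1 - t))
    green = round(255 - 155 * t)
    return f"#{red_blue:02x}{green:02x}{red_blue:02x}"


tiles = treemap(courses)
max_rate = max(tile["cert_rate"] for tile in tiles)
for tile in tiles:
    print(tile["course"], round(tile["w"] * tile["h"]), shade(tile["cert_rate"], max_rate))
```

Slice-and-dice is the simplest treemap layout; plotting libraries typically use a squarified variant that keeps tiles closer to square, which is easier to read when there are many courses.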
We have these findings:
- Introduction to Computer Science from HarvardX is one of the all-time favorites; however, its completion rate is also among the lowest.
- The share of people sticking through Humanities courses is higher. Communist Liberations, Creating China and Invasions, Rebellions, and the end of Imperial China rank in the top 3 in certification rate. However, there is no granular student demographic data to reveal the reason.
- Within Social Science, Health and Society has a high certification rate, while The Analytics Edge (the first Data Science MOOC I took) has the most participants.
- In the Science and Engineering bucket, Super Earth and Life turns out to have a relatively high proportion of people completing it.
Here are some of the most attended and completed courses, one from each category:
And here are some of the least attended and completed ones:
As a next step, we could look at similar courses offered by different institutions and compare their participation rates, or explore which keywords in the course subjects are attractive or trending.
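As a rough starting point for the keyword idea, one could count how many course titles each word appears in. The sketch below assumes keywords are drawn from course titles, uses only titles already mentioned in this post, and relies on a small hand-picked stopword list that is purely an assumption.

```python
import re
from collections import Counter

# Hand-picked stopword list (an assumption; extend as needed).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on"}


def keyword_counts(titles):
    """Count, for each word, the number of titles it appears in."""
    counts = Counter()
    for title in titles:
        words = set(re.findall(r"[a-z]+", title.lower()))
        counts.update(words - STOPWORDS)
    return counts


titles = [
    "Introduction to Computer Science",
    "Invasions, Rebellions, and the end of Imperial China",
    "Creating China",
    "Health and Society",
    "The Analytics Edge",
    "Super Earth and Life",
]

counts = keyword_counts(titles)
print(counts.most_common(5))
```

Weighting each title by its enrolment instead of counting it once would turn this into a measure of which keywords attract participants, and grouping by launch year would show which ones are trending.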
This is one of my practice data projects. You can view the code for today’s analysis here.