A job-ready Data Science curriculum

I recently graduated from the MSc Business Analytics at the UCD Michael Smurfit Graduate School of Business in Dublin, Ireland. Taking us from little to no programming ability to entry-level data science candidates in a year was a great achievement for the course, and I thoroughly enjoyed my time (which you can read more about here).

I’m moving back to Perth, Australia and have started the job hunt (hire me?). Looking at job specs, I realise there are many areas of analytics that we didn’t cover. I’ve also spoken to classmates who have already run into topics they’re scrambling to learn.

While no curriculum can cater for every taste and whim, and while my course packed a lot into a year, I’ve been thinking a lot about what expansions and changes would produce my ‘ideal’ curriculum for Data Science. The hope is to produce the most ‘job ready’ graduates for roles in Data Science, Data Strategy, Data Consulting, Business Analytics and Web Analytics, all areas requiring a similar skillset.

Semester One — The basics

Units:

  • Introduction to Programming
  • Statistics and Simulations
  • Project Management
  • Calculus
  • Numerical Analytics and Software

Introduction to Programming

Introduce students to Python (or R) and SQL, and bring them up to a level where they are comfortable handling data with numpy and pandas and understand how to retrieve and edit data in databases. Very basic introduction to JavaScript, HTML and CSS.

Statistics and Simulations

Theoretical introduction to statistics and random sampling, up to parameter estimation and hypothesis testing. Practical applications using hacker statistics (simulation modelling).
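As a flavour of what the hacker statistics part of this unit might look like, here is a minimal sketch of a percentile bootstrap: estimating a confidence interval for a mean by resampling rather than by formula. The function name `bootstrap_mean_ci`, its parameters and the ten data points are all made up for illustration, and it sticks to Python's standard library rather than numpy to stay self-contained.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Read the interval straight off the sorted resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical data: ten made-up measurements.
data = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 2.0, 3.0, 2.6, 2.2]
low, high = bootstrap_mean_ci(data)
print(f"sample mean = {statistics.fmean(data):.2f}, 95% CI ~ ({low:.2f}, {high:.2f})")
```

The appeal for teaching is that the simulation makes no distributional assumption beyond the sample being representative, so students can sanity-check the theoretical results from the first half of the unit.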

Project Management

Agile IT project management discussion and critique. Understanding biases in collection and analysis of data. Git and collaborative software projects.

Calculus

Theoretical basis for understanding machine learning algorithms. Logic, linear algebra, matrices, vectors, differential calculus, optimisation.

Numerical Analytics and Software

Finite precision computing, algorithmic complexity, condition & stability, systems of linear equations, optimisation, deeper discussion of certain algorithms in mathematical context (SVMs etc).
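To illustrate why finite precision deserves a place in the unit, here is a small sketch of two classic floating-point surprises in Python. It is only an example of the kind of thing the unit could open with, not a summary of the topic, and the variable names are arbitrary.

```python
import math

# 0.1 and 0.2 have no exact binary representation, so their sum is not exactly 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)           # False
close_enough = math.isclose(total, 0.3)  # True: compare with a tolerance instead

# Adding floats one at a time accumulates rounding error...
values = [0.1] * 10
naive_sum = 0.0
for v in values:
    naive_sum += v

# ...whereas math.fsum tracks the lost low-order bits and corrects for them.
accurate_sum = math.fsum(values)

print(exactly_equal, close_enough)
print(naive_sum == 1.0, accurate_sum == 1.0)
```

The explicit loop is deliberate: recent Python versions make the built-in `sum` more accurate for floats, which would hide the effect.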


Semester Two — Building the base

Units:

  • Introduction to Data Mining and Machine Learning
  • Big Data and Distributed Computing
  • Data Visualisation and Design
  • Research Methods and Analysis
  • Data Engineering and Strategy

Introduction to Data Mining and Machine Learning

Theoretical and practical introduction to supervised and unsupervised learning using either R or Python. Understanding popular algorithms for regression, classification, clustering and association mining, as well as overfitting and parameter estimation.

Big Data and Distributed Computing

Introduction to Spark and MapReduce and their APIs. Practical experience setting up and managing cloud servers, as well as a sample system of Raspberry Pis in class.

Data Visualisation and Design

Expand on introduction to JavaScript, and explore the Django web framework for dashboards. Data vis and design theory, as well as practical examples with D3. Introduction to geospatial visualisation with GMaps V3 and Leaflet.

Research Methods and Analysis

Understanding how to read and critique academic papers, and how to conduct a review of literature.

Data Engineering and Strategy

Business experience in how to identify and produce data collection, aggregation, and analysis plans, as well as how to deploy machine learning models in Django. Aim to give students an understanding of the business strategy behind data stacks. Introduction to Blockchain and other innovations with significant business impact.


Semester Three — Advanced Topics

Units:

  • Network Analysis (Graph Theory)
  • Optimisation Problems
  • Natural Computing and Deep Learning
  • Web and Marketing Analytics
  • Real Time Analytics and IoT

Network Analysis (Graph Theory)

Understanding graph theory and how network models can be evaluated using different metrics and optimised. Focus on social network analysis.
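For a sense of the metrics involved, here is a minimal sketch that computes degree centrality and a shortest path length on a tiny, invented friendship network. The names, the graph and both helper functions are hypothetical, and in a real course a library such as NetworkX would likely do this work; plain Python is used here only to keep the example self-contained.

```python
from collections import deque

# Hypothetical friendship network as an undirected adjacency list.
friends = {
    "ana": {"ben", "cat", "dev"},
    "ben": {"ana", "cat"},
    "cat": {"ana", "ben"},
    "dev": {"ana", "eli"},
    "eli": {"dev"},
}


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def shortest_path_length(graph, source, target):
    """Number of edges on a shortest path, found by breadth-first search."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # target is not reachable from source


centrality = degree_centrality(friends)
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality[most_connected])
print(shortest_path_length(friends, "ben", "eli"))
```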

Optimisation Problems

Introduction to Linear and Integer Programming for optimisation in Python, and types of optimisation problems that can be solved at scale, including TSP, constrained optimisation and goal programming.

Natural Computing and Deep Learning

Introduction to the field of nature inspired algorithms, with a specific focus on neural networks using TensorFlow or Keras. Understanding the types of optimisation problems suited for such methods, and gaining experience in deploying neural networks on machine vision problems.

Web and Marketing Analytics

Introduction to Google Analytics and other data gathering and analysis services which are available for websites and advertising. Explore marketing metrics and KPIs, along with algorithms for improving the efficiency of targeted advertising and an introduction to SEO.

Real Time Analytics and IoT

Understanding the challenges of dealing with a real time data flow from multiple sources, particularly for outlier detection and analysis. Build solutions which can handle this constant flow of information in an appropriate manner.
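One way to picture the problem is a detector that sees each reading exactly once and cannot store the full history. The sketch below uses Welford's online algorithm to keep a running mean and variance in constant memory, and flags readings that sit more than a few standard deviations from the mean. The class and function names, the threshold, the warm-up length and the sensor readings are all assumptions made for illustration, not a recommended production design.

```python
import math


class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample standard deviation; 0.0 until there are at least two readings.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_outliers(stream, threshold=3.0, warmup=5):
    """Yield (reading, is_outlier) pairs, judging each reading against the past."""
    stats = RunningStats()
    for x in stream:
        std = stats.std()
        is_outlier = (
            stats.n >= warmup and std > 0 and abs(x - stats.mean) > threshold * std
        )
        if not is_outlier:
            stats.update(x)  # keep flagged readings out of the baseline
        yield x, is_outlier


# Hypothetical temperature readings with one obvious spike.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1]
flagged = [x for x, is_outlier in flag_outliers(readings)]
flagged = [x for x, is_outlier in flag_outliers(readings) if is_outlier]
print(flagged)
```

Leaving flagged readings out of the running statistics is a design choice: it stops a spike from inflating the baseline, but it also means a genuine shift in the data would keep being flagged, which is exactly the kind of trade-off the unit could explore.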


Semester Four — Dissertation or Practicum

Single extended research or practical project to satisfy Masters level study requirements. Focus on messy, real world datasets to build data munging and complementary data collection skills.


I’m interested to hear your thoughts. Is there anything else you’ve seen that employers are looking for? And for the data scientists out there: what skills do graduate hires lack that could be taught at a Masters level?

If you liked this post, please click and hold the 👏 button to let your followers know, or let me know your thoughts below or on Twitter @padams02.