IGNOU’s Post Graduate Diploma in Applied Statistics (PGDAST): Why a Data Science Enthusiast Should Care About It

Akshay L Chandra
Jun 22, 2018

I work as an Associate Software Engineer for GGK Technologies in the Business Intelligence team (Data Science wing). I finished the PGDAST programme in June 2018 with a final score of 85%.

My objective here is to break down every course offered in the programme and show how each one relates to a learner in the data science field.

What is Applied Statistics?

Applied statistics is a term commonly used for courses aimed at a less mathematically oriented audience that teach you how to apply statistical tools for purposes such as data analysis. Statistics is concerned with statistical problems, while applied statistics is concerned with using statistics to solve other problems.

If it’s not already clear, this programme has much less to do with proofs and derivations in statistics and focuses almost entirely on the applications part (problem-solving). You will still find proofs and derivations here and there in the material offered by IGNOU, but they are either too mathematically heavy to follow or too thin to build understanding, and the steps are not as straightforward as they should be. Moreover, the assignment questions and the questions in the term end exams (TEE) test you only on the applications part, i.e., problem-solving. If you only enjoy applying a concept when you have a full understanding of the proof, you will probably treat the IGNOU material with disdain.

“My proof is my business, none of your business.”

What courses does PGDAST offer?

Core Courses:

MST-001: Foundation in Mathematics and Statistics
MST-002: Descriptive Statistics
MST-003: Probability Theory
MST-004: Statistical Inference
MST-005: Statistical Techniques

Elective Courses:

MSTE-001: Industrial Statistics-I
MSTE-002: Industrial Statistics-II

Personal Note:
One more specialization, Bio-Statistics, is likely to be added to the programme soon. Until then (as of June 2018), Industrial Statistics is the only specialization you can choose.

Lab Courses:

MSTL-001: Basic Statistics Lab − Core Course
MSTL-002: Industrial Statistics Lab − Elective Course

Breaking Down The Courses One By One!

MST-001: Foundation in Mathematics and Statistics

This course presents elementary mathematical tools and the basics of statistics. The concept list includes sets, functions, progressions (and counting techniques), calculus, linear algebra and presentation of data (you know, bar graphs, pie charts etc.).

The concepts and problems will take you back to school and they are not challenging at all. Even though the course includes calculus and linear algebra, which are the most important tools a learner in the data science field must possess, the treatment is straight out of a 10th-grade math textbook and not advanced enough for data science applications. So this course will just warm you up to face what’s coming next.

Just a warm up course!

MST-002: Descriptive Statistics

This course familiarizes the learners with the basic methods of analyzing different types of data, which I think are very important for anyone dealing with any kind of data. The concept list includes measures of central tendency and dispersion (mean, variance and stuff), skewness and kurtosis, correlation, regression and theory of attributes.

The course describes ways to analyze univariate distributions, such as how to compare them with one another. It introduces the fundamental concept of a statistical relationship between two variables, discusses how to measure the strength of that relationship, and covers the other things you need to know to make useful sense out of the data given to you. It also briefly talks about ‘Regression Analysis’, the first real-world problem a learner in the ML/DL/DS field solves, or so I think.
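To make the ideas of correlation and simple regression concrete, here is a minimal sketch in plain Python. The data (`hours`, `score`) and the helper names `pearson_r` and `fit_line` are made up for illustration and do not come from the IGNOU material.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for the line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

r = pearson_r(hours, score)
slope, intercept = fit_line(hours, score)
print(f"r = {r:.4f}, score ~ {intercept:.1f} + {slope:.1f} * hours")
```

A value of `r` close to 1 says the two variables move together almost linearly, and the fitted slope tells you how much the score changes per extra hour in this toy data.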

MST-003: Probability Theory

Half of this course discusses basic concepts of probability — like the laws of probability, Bayes’ theorem, random variables, mathematical expectation et cetera. The other half discusses a bunch of probability distributions (from the good-old-friendly Binomial distribution to not-so-friendly Beta and Gamma distributions), yes the very same ones we learned in high school but with more examples and hypothetical situations where you can use the concepts to come out of those situations with a win.
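As a small illustration of the friendliest distribution on that list, the sketch below computes Binomial probabilities from the textbook formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the coin-toss numbers are my own choices for the example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical situation: 10 tosses of a fair coin.
n, p = 10, 0.5
prob_five_heads = binomial_pmf(5, n, p)
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"P(exactly 5 heads) = {prob_five_heads:.4f}")
print(f"Sum over all outcomes = {total:.4f}")
```

The probabilities over all possible values of k add up to 1, which is a quick sanity check that the formula describes a proper distribution.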

Personal Note:
Even though I solved more rigorous problems on the same probability distributions in school, it was very satisfying for me to rediscover the fact that there are patterns everywhere i.e., every outcome of every experiment/action belongs to some probability distribution or another and we just happen to study the frequently occurring ones.

That’s me when I could finally understand the intuition behind Beta and Gamma distributions!

MST-004: Statistical Inference

This course will help you understand some useful statistical techniques for drawing inferences about a population on the basis of sample(s). It discusses the basic concepts of sampling distributions and their applications, estimation theory, and parametric and non-parametric testing of hypotheses.
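One of the simplest parametric tests can be sketched in a few lines: a two-sided one-sample z-test. This sketch assumes the population standard deviation is known, and the sample, the hypothesised mean `mu0` and `sigma` are all invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma):
    """Two-sided one-sample z-test; sigma is the (assumed known) population std dev."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample: weights (in grams) of 10 packets claimed to weigh 500 g each.
weights = [502, 498, 505, 507, 499, 503, 506, 504, 501, 505]

z, p_value = z_test(weights, mu0=500, sigma=4)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

If the p-value falls below the chosen significance level (0.05 is a common choice), the sample gives evidence against the claim that the population mean equals `mu0`.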

Personal Note:
I always knew that statisticians drew inferences about the population using just the sample data but I could never see it in my head. I solved a lot of “Testing of Hypotheses” problems when I took the statistics course in college and I thought I knew how it worked, but I was wrong. Thanks to this course, for the first time in my academic life I could actually visualize the whole idea behind the term “statistical inference”: the estimations and the statistical tests, the motivation behind them, the execution, and why certain conclusions are made the way they are. And I hope my statistics professor never reads this.

MST-005: Statistical Techniques

This course is designed to acquaint the learners with the statistical techniques used for sample surveys and their analysis, analysis of variance (ANOVA), design of experiments, some useful methods of generation of random numbers and applications of simulation techniques.

If you have spent some time solving real-world data science problems, you are likely to agree with me that ANOVA is a powerful tool to come up with a lot of cool insights from the data — like testing effects of variables on one another, effects of interactions between them on some other variable, the relationship between the variables etc. and you get to learn all about it in this course.
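For a feel of what ANOVA computes, here is a minimal sketch of the one-way F statistic, which compares the variation between group means with the variation within groups. The three groups of numbers are hypothetical, and turning F into a p-value would need the F distribution, which this sketch leaves out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for one-way ANOVA over a list of groups (lists of numbers)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical yields under three different treatments.
treatments = [
    [20, 22, 19, 21],
    [25, 27, 26, 24],
    [30, 29, 31, 32],
]

f_stat = one_way_anova_f(treatments)
print(f"F = {f_stat:.2f}")
```

A large F value means the group means differ by far more than the scatter inside each group would explain.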

MSTE-001: Industrial Statistics-I

As the name suggests, this course is centered around the tools of statistics that are used to work with industrial and business data. The first half of the course discusses statistical quality control — process control and product control. The second half introduces the learners to the mathematical study of strategies for optimal decision-making i.e., Decision Theory and Game Theory!
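One of the first ideas in game theory is easy to sketch in code: in a two-person zero-sum game, the row player looks for the maximin and the column player for the minimax, and when the two coincide the game has a saddle point. The payoff matrix and the function name `saddle_point` below are made up for illustration.

```python
def saddle_point(payoff):
    """Return (row, column, value) of a saddle point, or None if there is none.

    payoff[i][j] is the amount the row player wins when the row player
    picks strategy i and the column player picks strategy j.
    """
    row_mins = [min(row) for row in payoff]
    col_maxs = [max(col) for col in zip(*payoff)]
    maximin = max(row_mins)  # best worst-case for the row player
    minimax = min(col_maxs)  # best worst-case for the column player
    if maximin != minimax:
        return None  # no pure-strategy solution
    return row_mins.index(maximin), col_maxs.index(minimax), maximin


# Hypothetical payoff matrix for the row player.
game = [
    [4, 2, 5],
    [3, 1, 0],
    [6, 3, 7],
]

print(saddle_point(game))
print(saddle_point([[1, -1], [-1, 1]]))  # matching pennies has no saddle point
```

When no saddle point exists, as in the second call, the players have to mix their strategies at random.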

If you haven’t heard of decision theory and game theory already, do some quick research or watch the video I attached below. I’m sure you will notice how simple mathematics can help us get the best out of the choices at hand. A data scientist who is focused more on making business sense out of analytics should be able to leverage the concepts of game theory to derive strategic decisions from raw data.

Always go for the second best — John Nash

MSTE-002: Industrial Statistics-II

I consider this the most important course in the entire programme from a data lover’s perspective. The second half of the course discusses regression modelling and time series modelling, and I think I need not emphasize the role they both play in the world of data science. If you have no idea what I am talking about, I urge you to check the following posts: Regression Modelling and Time Series Modelling.

The first half of the course lets the learners peek into the idea of Operations Research (OR), which “is a discipline that deals with the application of advanced analytical methods to help make better decisions. Employing techniques from other mathematical sciences, such as mathematical modeling, statistical analysis, and mathematical optimization, operations research arrives at optimal or near-optimal solutions to complex decision-making problems.” (Wikipedia). However, as I mentioned earlier, the material emphasizes only the steps/rules to solve various kinds of ‘complex’ decision-making problems and has no proper literature to help the learners understand the mathematical stories behind the methods included.
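As a tiny taste of time series work, the sketch below smooths a series with a simple moving average, one common way to expose a trend. The sales figures, the window size and the function name `moving_average` are assumptions made for the example.

```python
def moving_average(values, window):
    """Simple moving average: the mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly sales figures.
sales = [10, 12, 14, 16, 18]

smoothed = moving_average(sales, window=3)
print(smoothed)
```

A wider window smooths more aggressively but leaves you with fewer points and reacts more slowly to real changes in the series.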

Personal Note:
The material doesn’t even try to give the learners an intuition for why a particular method works, so I would ask you not to expect much ‘mathematical learning’ from the official material for this course. YouTube has a series of long but brilliant lectures on OR by Prof. G. Srinivasan, IIT Madras, so it wasn’t very hard to find the ‘mathematical stories’ I wanted to hear.

Lab Courses

  • MSTL-001: Basic Statistics Lab
  • MSTL-002: Industrial Statistics Lab

These lab courses consist of exercises based on the theory from the core and elective courses of the programme. Everything is done in MS Excel.

The lab exercises include graphical representations, correlation analysis, hypothesis testing, ANOVA, process control charts, regression analysis, time series analysis etc. They are fun.

Note:
I have compiled all the useful resources concerned with this programme, like the study material, the previous question papers, the assignments, the formulae booklet and the programme guide (must read) and put them all here.

Conclusion

Apart from all the common issues I had to face as a first-time part-time student, I enjoyed all the courses offered. I could see how my interpretation of various data science concepts, models and real-time results improved after enrolling in the programme. I recommend it not just for the courses it offers but also for the way it helped me think statistically and look at the problems I deal with more mathematically.

Thank you for reading such a long post.

Live and let live!
A
