How to learn Statistics

Chitranjan Gupta
3 min read · Jun 22, 2022

A quick yet complete roadmap to work on your Statistics skills 👏👏

Source: Canva

Most data science techniques are built on math and statistics. As a data scientist, you will need a solid understanding of these core concepts to perform exploratory data analysis, forecast future events, and effectively apply machine learning and deep learning.

With this article, I want to help you, the new data enthusiasts looking to build a career in data science. If you opened this article, you are probably wondering in what order you should learn statistics. Here I list the concepts you will need in your career, but remember that you don’t need to learn them all in one go.

I believe the best way to work on your statistics is to start a new data science project and, wherever the project calls for statistics, consult your favorite resources, websites, or mentors. Knowing how to find the right answers in the vast ocean of the internet is the most valuable skill here (#opinion).

Coming back to the agenda of this article, here is how I suggest you approach learning statistics.

First, let me divide the statistics concepts into 3 levels:

  1. Basic Concepts
  2. Intermediate Statistics
  3. Advanced Statistics

Basic Concepts👏

Start by learning the basic terminology of statistics, then work through the descriptive statistics concepts in the order listed below:

  1. Variables and Random Variables
  2. Population and Sample, Population mean and Sample mean
  3. Sampling Distribution
  4. Measures of Central tendency (Mean, Median and Mode)
  5. Measures of Variability (Range, IQR, Variance and Standard Deviation)
  6. Skewness and Kurtosis
  7. Gaussian or Normal Distribution
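
To make a few of these ideas concrete, here is a minimal sketch that computes the measures of central tendency and variability listed above, plus a simple moment-based skewness. It uses only Python's built-in statistics module; the sample values and the helper name describe are made up for illustration and do not come from any real dataset.

```python
import statistics

def describe(sample):
    """Return basic descriptive statistics for a list of numbers.

    `describe` is a hypothetical helper written for this article,
    not a function from any library.
    """
    mean = statistics.mean(sample)
    pop_std = statistics.pstdev(sample)          # population standard deviation
    q1, _, q3 = statistics.quantiles(sample, n=4)  # quartile cut points
    # Moment-based skewness: average cubed deviation divided by std cubed.
    skew = sum((x - mean) ** 3 for x in sample) / len(sample) / pop_std ** 3
    return {
        "mean": mean,
        "median": statistics.median(sample),
        "mode": statistics.mode(sample),
        "range": max(sample) - min(sample),
        "iqr": q3 - q1,
        "population_variance": statistics.pvariance(sample),
        "population_std": pop_std,
        "sample_variance": statistics.variance(sample),  # divides by n - 1
        "skewness": skew,
    }

# Invented example data, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Note the two variance entries: the population variance divides by n, while the sample variance divides by n - 1, which is exactly the population versus sample distinction from item 2. A positive skewness here reflects the longer tail on the right side of the data.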

Here is a cheat-sheet on descriptive statistics for your reference: access here

Intermediate Statistics👏

Once you understand the basic concepts and feel comfortable with statistics, you can proceed to the following topics:

  1. Standard Normal Distribution
  2. Z-scores
  3. Probability
  4. Central Limit Theorem
  5. Confidence Interval and P value
  6. Type 1 and Type 2 errors
  7. Hypothesis testing
  8. One-tailed and Two-tailed tests
  9. Z-test, T-test, Chi-square test, ANOVA test (F-test)
  10. Covariance
  11. Pearson Correlation and Spearman Rank Correlation
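
Several of the topics above fit in a few lines of code. The sketch below, again using only Python's standard library, shows z-scores, a normal-approximation confidence interval for a mean, a two-tailed one-sample z-test, and the Pearson correlation. All function names and data are hypothetical, chosen for illustration. The z-test assumes the population standard deviation is known; for small samples with unknown variance, a t-test is usually the better choice.

```python
import math
from statistics import NormalDist, mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample std."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean."""
    m = mean(values)
    standard_error = stdev(values) / math.sqrt(len(values))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m - z * standard_error, m + z * standard_error

def one_sample_z_test(values, mu0, sigma):
    """Two-tailed z-test of H0: population mean == mu0 (sigma assumed known)."""
    z = (mean(values) - mu0) / (sigma / math.sqrt(len(values)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the stds."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented example data, for illustration only.
heights = [160, 165, 170, 175, 180]
weights = [55, 60, 66, 70, 77]

print(z_scores(heights))
print(mean_confidence_interval(heights))
print(one_sample_z_test(heights, mu0=168, sigma=8))
print(pearson_r(heights, weights))
```

A small p-value from the test would suggest rejecting the null hypothesis at your chosen significance level; rejecting a true null is a Type 1 error, and failing to reject a false one is a Type 2 error.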

Here is another cheat-sheet, this one on inferential statistics: access here

Advanced Statistics👏

At this point, you can consider yourself conceptually clear on the statistics used in over 90% of data science projects. However, there are a few more advanced statistics concepts, mostly different kinds of distributions that describe how more complex scenarios behave.

  1. Log Normal distribution
  2. Bernoulli distribution
  3. Binomial distribution
  4. Poisson distribution
  5. Power law distribution
  6. Chebyshev’s inequality
  7. Q-Q plotting
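
As a rough illustration of a few of these, the sketch below writes out the binomial and Poisson probability mass functions by hand, draws a log-normal sample, and checks Chebyshev’s inequality on it. It relies only on Python's standard library; the function names, parameters, and random seed are assumptions made for this example.

```python
import math
import random
import statistics

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events when events occur at an average rate lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def fraction_beyond(values, k):
    """Fraction of values at least k population stds away from the mean."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(abs(v - m) >= k * s for v in values) / len(values)

# A Bernoulli trial is just a binomial with n = 1.
print(binomial_pmf(1, 1, 0.3))
print(binomial_pmf(3, 10, 0.3))
print(poisson_pmf(2, 4.0))

# Log-normal sample: the logarithm of each value is normally distributed.
random.seed(0)  # arbitrary seed, fixed only for reproducibility
lognormal_sample = [random.lognormvariate(0.0, 1.0) for _ in range(10_000)]

# Chebyshev's inequality: for any distribution, at most 1 / k**2 of the
# values lie k or more standard deviations from the mean.
for k in (2, 3):
    print(k, fraction_beyond(lognormal_sample, k), "<=", 1 / k ** 2)
```

Chebyshev’s bound holds regardless of the distribution's shape, which is why it still applies to the heavily right-skewed log-normal sample, where the familiar normal-distribution rules of thumb do not.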

Honestly, many people simply call statistical functions from Python and R libraries in their projects, and most of them never have to consider the underlying math.

However, understanding the fundamentals of statistical analysis allows you to take a more strategic approach.

I hope the roadmap and suggestions above help you picture your journey with statistics. Please let me know if I missed any topic, or if you liked this article.

Happy Learning! 👏👏👏

Chitranjan Gupta

Health Data Analyst. I talk about data analytics, visualizations, statistics, programming, machine learning, and a lot more.