Statistics Part 2: Inferential Statistics

Avinaba Mukherjee
5 min read · Dec 24, 2022

--

Pic: www.freepik.com

Introduction:

The branch of mathematics that deals with collecting, analyzing, interpreting, and presenting numerical data is known as statistics.

Statistics is primarily divided into two categories:

· Descriptive Statistics

· Inferential Statistics

In the previous blog I wrote about descriptive statistics; in this one I will discuss inferential statistics.

I request you all to read the first part before reading this one, because it covers sample and population, and those concepts are required for inferential statistics.

Here is the link to the previous blog:

Now let us start inferential statistics from scratch.

What is Inferential Statistics?

Inferential statistics is the branch of statistics that draws conclusions (inferences) about a population using data from a sample.

E.g.: Suppose you want to know the average number of hours that kindergarten kids in your state spend watching cartoons.

You can collect the data in two ways:

1. By visiting every kindergarten in your state. (Population)

2. By visiting every kindergarten in your city. (Sample)

The first way is far more time-consuming and expensive than the second.

There is also a possibility that the average number of hours for your city is (a) much higher or (b) much lower than the state average, because of the facilities and lifestyle of your city, so the sample has to be chosen with care.
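To make the population versus sample idea concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the 20 cities, the 500 kids per city, and the viewing hours are synthetic numbers, not survey data. It compares the state-wide mean with the mean of a single city and with the mean of a random sample drawn from the whole state.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 20 cities x 500 kids, each city with its own typical level.
population = []
for city in range(20):
    city_level = rng.uniform(1.0, 3.0)  # assumed typical hours per day in this city
    for _ in range(500):
        hours = max(0.0, rng.gauss(city_level, 0.5))
        population.append((city, hours))

# Population parameter: the mean over every kid in the state.
state_mean = mean(h for _, h in population)

# Sample 1: everyone in a single city (may be far from the state mean).
one_city_mean = mean(h for c, h in population if c == 0)

# Sample 2: 500 kids picked at random from the whole state.
random_sample = rng.sample(population, 500)
random_sample_mean = mean(h for _, h in random_sample)

print(f"state mean:         {state_mean:.2f}")
print(f"one-city mean:      {one_city_mean:.2f}")
print(f"random-sample mean: {random_sample_mean:.2f}")
```

In this sketch the random sample usually lands close to the state mean, while a single city can sit well above or below it.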

In this blog I will discuss probability, probability distributions, hypothesis testing, the t-test, the chi-square test and the ANOVA test.

Probability:

Probability is the likelihood of an event occurring.

For example, if you draw a card from a standard deck of cards, what is the probability that it is a red card?

Here is the solution:

Total number of cards n(S) = 52

Number of red cards n(A) = 26

Hence, P(Red) = n(A)/n(S) = 26/52 = ½ = 50%

So, the probability of drawing a red card is 50%.
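The same calculation can be checked by enumerating the deck. This is only a small sketch; the rank and suit labels are names chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck: 13 ranks in each of 4 suits.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Event A: the drawn card is red (hearts or diamonds).
red_cards = [card for card in deck if card[1] in ("hearts", "diamonds")]

# Classical probability: favourable outcomes / total outcomes.
p_red = Fraction(len(red_cards), len(deck))
print(p_red, float(p_red))  # 1/2 0.5
```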

· Random Experiment:

A random experiment is a procedure whose outcome is uncertain.

E.g., Rolling a die

· Outcome:

Result of a Random Experiment.

E.g., If we roll a die, the outcome can be any one number on the die: {1,2,3,4,5,6}

· Sample Space:

Set of all possible outcomes.

E.g., All the possible outcomes of rolling a die → {1,2,3,4,5,6}

· Trials:

When a Random Experiment is repeated, each repetition is known as a Trial.

· Event:

A subset of Sample Space.

E.g., The even and odd outcomes of rolling a die. Odd → {1,3,5} ; Even → {2,4,6}

· Random Variable:

A Random Variable is a numerical description of the outcomes of a Random Experiment: it assigns a number to each outcome in the Sample Space.
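The terms above can be tied together in a few lines of code. This is an illustrative sketch only: the helper names `roll` and `is_even` and the seed are made up here, and the random variable shown (1 for an even roll, 0 for an odd roll) is just one possible choice.

```python
import random

# Sample space: every possible outcome of rolling one die.
sample_space = {1, 2, 3, 4, 5, 6}

# Events: subsets of the sample space.
even = {x for x in sample_space if x % 2 == 0}
odd = sample_space - even

def roll(rng):
    """One trial of the random experiment: returns a single outcome."""
    return rng.choice(sorted(sample_space))

def is_even(outcome):
    """A random variable: maps each outcome to a number (1 if even, 0 if odd)."""
    return 1 if outcome in even else 0

rng = random.Random(0)                      # arbitrary seed for repeatability
outcomes = [roll(rng) for _ in range(10)]   # ten trials
values = [is_even(o) for o in outcomes]     # the random variable on each trial

print("outcomes:", outcomes)
print("values:  ", values)
```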

Probability Distributions:

A probability distribution is a list of all possible values of a Random Variable together with their corresponding probabilities.

E.g.,
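As one illustration (an example picked here, assuming two fair six-sided dice), the sketch below builds the probability distribution of the random variable "sum of the two dice" by counting outcomes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
rolls = list(product(range(1, 7), repeat=2))

# Random variable: the sum of the two dice. Count how often each sum occurs.
counts = Counter(a + b for a, b in rolls)

# Probability distribution: each possible sum with its probability.
distribution = {total: Fraction(n, len(rolls)) for total, n in sorted(counts.items())}

for total, p in distribution.items():
    print(total, p)
```

The probabilities always add up to 1, which is a quick sanity check for any probability distribution.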

Types of Probability Distribution:

· Uniform Distribution

· Bernoulli Distribution

· Binomial Distribution

· Poisson Distribution

· Normal Distribution
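For reference, here is a minimal sketch of the probability functions behind these five distributions, written with the standard library only. The function names and parameter names are chosen here for illustration, and the uniform case is taken to be the discrete uniform distribution on 1..n.

```python
import math

def uniform_pmf(k, n):
    """Discrete uniform on 1..n: every value is equally likely."""
    return 1 / n if 1 <= k <= n else 0.0

def bernoulli_pmf(k, p):
    """One trial with success probability p (k is 1 for success, 0 for failure)."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    return 0.0

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of k events when events occur at an average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal (bell curve) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

print(uniform_pmf(3, 6))         # one face of a fair die
print(binomial_pmf(2, 10, 0.5))  # exactly 2 heads in 10 fair coin flips
print(poisson_pmf(0, 2.0))       # no events when 2 are expected on average
print(normal_pdf(0.0))           # peak of the standard normal curve
```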

Confidence Interval:

The range of values that is likely to include a population parameter with a certain degree of confidence is called a confidence interval. The confidence level is expressed as a percentage (%).

E.g., a 90% confidence interval of (130 cm, 150 cm) for the population mean height of females indicates that we are 90% confident that the mean height of females lies between 130 cm and 150 cm.
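A confidence interval for a mean can be sketched as follows. The heights below are made-up numbers, and the interval uses the normal approximation (sample mean ± z × standard error) for simplicity; with a sample this small, a t-based critical value would usually be the more careful choice.

```python
import math
from statistics import NormalDist, mean, stdev

# Made-up sample of heights in cm (illustrative data only).
heights_cm = [138, 142, 135, 150, 147, 139, 144, 141, 136, 148, 143, 140]

confidence = 0.90
n = len(heights_cm)
x_bar = mean(heights_cm)                    # sample mean
std_err = stdev(heights_cm) / math.sqrt(n)  # standard error of the mean

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

lower = x_bar - z_crit * std_err
upper = x_bar + z_crit * std_err
print(f"{confidence:.0%} CI for the mean: ({lower:.1f} cm, {upper:.1f} cm)")
```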

Hypothesis Testing:

Hypothesis testing is a procedure in statistics in which researchers assume a result and propose it for the sake of argument, so that it can be tested against sample data to see whether it holds.

Types of Hypothesis:

Hypotheses are classified into two types:

· Null Hypothesis

o A statement about a population parameter

o States that the population parameter (mean, variance etc.) is equal to the assumed value

o Denoted by H0.

· Alternate Hypothesis

o A statement that directly contradicts the Null Hypothesis

o States that the population parameter is different from the assumed value

o Denoted by Ha.

E.g., We want to test if the mean marks of pupils in mathematics are different from 40 (out of 50). The null and alternative hypotheses are

H0: μ = 40

Ha: μ ≠ 40

p-value:

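The p-value is the probability of getting a result at least as extreme as the one observed, assuming the null hypothesis is true. It is compared with a chosen significance level (α) to decide whether to reject the null hypothesis. 0.05 is the most commonly used significance level.

o If p ≤ 0.05, reject the null hypothesis.

o If p > 0.05, fail to reject the null hypothesis (this is not proof that it is true).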
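The decision rule above can be sketched in code. This example assumes the test statistic is a z-score from a two-sided test, so the p-value comes from the standard normal distribution; the three z values are arbitrary illustrations.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # significance level

for z in (1.0, 1.96, 3.0):  # arbitrary example test statistics
    p = two_sided_p_value(z)
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"z = {z:.2f}  p = {p:.4f}  ->  {decision}")
```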

z-test and t-test:

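z-test formula (one-sample test, population standard deviation σ known):

z = (x̄ − μ) / (σ / √n)

t-test formula (one-sample test, σ unknown and estimated by the sample standard deviation s):

t = (x̄ − μ) / (s / √n), with n − 1 degrees of freedom

Here x̄ is the sample mean, μ is the population mean assumed under H0, and n is the sample size.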
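As a sketch of how these two statistics are computed, the code below assumes the common one-sample forms: the z statistic uses a known population standard deviation, while the t statistic estimates it from the sample. The marks, the hypothesized mean of 40 and the "known" sigma of 3 are made-up values for illustration.

```python
import math
from statistics import mean, stdev

def z_statistic(sample, mu0, sigma):
    """One-sample z: (sample mean - mu0) / (sigma / sqrt(n)), sigma known."""
    return (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))

def t_statistic(sample, mu0):
    """One-sample t: (sample mean - mu0) / (s / sqrt(n)), s from the sample."""
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(len(sample)))

# Made-up marks out of 50 for ten pupils (illustrative data only).
marks = [38, 42, 35, 41, 44, 39, 37, 43, 40, 36]
mu0 = 40  # mean assumed under the null hypothesis

print("z =", round(z_statistic(marks, mu0, sigma=3.0), 3))  # sigma = 3 is assumed
print("t =", round(t_statistic(marks, mu0), 3), "with", len(marks) - 1, "degrees of freedom")
```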

Chi-squared test:

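When we compare the observed results with the expected results for categorical data, we perform a Chi-square test.

It relies on a large-sample approximation, so it works best when the sample is reasonably large; a common rule of thumb is that every expected count should be at least 5.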
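The comparison of observed and expected counts can be sketched as a goodness-of-fit statistic, the sum of (observed − expected)² / expected over all categories. The die-roll counts below are invented for illustration, and 11.07 is the standard table value for 5 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts from 60 rolls of a die (illustrative data only).
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6  # a fair die would give 10 per face on average

chi2 = chi_square_statistic(observed, expected)
critical_value = 11.07  # chi-square table, 5 degrees of freedom, alpha = 0.05

print(f"chi-square = {chi2:.2f}")
if chi2 > critical_value:
    print("reject H0: the counts differ from what a fair die would give")
else:
    print("fail to reject H0: the counts are consistent with a fair die")
```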

ANOVA Test:

Analysis of variance (ANOVA) is performed to check whether the means of two or more groups are significantly different from each other.

It tests the influence of one or more factors by comparing the means of different samples.
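For a single factor, the comparison of means can be sketched as the one-way ANOVA F statistic: the variation between group means divided by the variation within groups. The function name and the three small groups below are made up for illustration.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Made-up scores for three groups (illustrative data only).
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value = one_way_anova_f(groups)
print("F =", f_value)  # F = 3.0
```

A larger F means the group means differ more than the spread inside the groups would explain; the value is then compared with an F table for (k − 1, n − k) degrees of freedom.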

This is a complete overview of inferential statistics. In the upcoming posts I will drill down into each test. 😊😊👍👍

Follow me for the next posts.
