What is inferential statistics?

Inferential statistics uses a sample of data taken from a population to make inferences, predictions or generalizations about that population.

The goal of inferential statistics is to draw conclusions from the sample and generalize them to the population as a whole.


Let’s understand inferential statistics through an example:

Suppose we want to do some research about doctors. Some of the questions we might ask are as follows:

  1. What is the average salary of a doctor?
  2. What percentage of doctors hold a specialisation degree?

To answer the above questions exactly, we would have to conduct a census, that is, survey each and every doctor. As you may have realized, we cannot ask every person in this huge population of doctors. Instead, we can pick a random sample from the population, put these questions to that sample, and draw inferences or conclusions from the responses received.

For the above-mentioned research questions, the population comprises all the doctors on planet Earth. The key point here is deciding whether an observation made on the small sample holds for the complete population. For example, if a doctor’s average salary based on the sample is found to be $100,000, does that hold true for the entire population of doctors? Inference is all about finding answers to such questions.
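To make the sample-versus-population idea concrete, here is a minimal sketch in Python. The salary figures are simulated purely for illustration (the distribution, its parameters and the sample size are assumptions, not real data). In practice the population mean is unknown, which is exactly why we sample.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 simulated doctor salaries (made-up data).
population = [rng.lognormvariate(11.5, 0.4) for _ in range(100_000)]

# Draw a simple random sample of 500 doctors without replacement.
sample = rng.sample(population, 500)

population_mean = statistics.fmean(population)  # normally unknown to us
sample_mean = statistics.fmean(sample)          # what the survey gives us
relative_error = abs(sample_mean - population_mean) / population_mean

print(f"Population mean: {population_mean:,.0f}")
print(f"Sample mean:     {sample_mean:,.0f}")
print(f"Relative error:  {relative_error:.2%}")
```

The sample mean will usually land close to the population mean but not exactly on it. How much trust to place in that gap is the question inference tries to answer.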

Let’s look more closely at the words “sample” and “population” using another example.

Consider an Indian general election. It would be very useful to know the political leanings of every single eligible voter, but surveying every voter is not feasible. Instead, we could poll some subset (a sample) of the population, such as a thousand registered voters, and use that data to make inferences about the population as a whole. This makes the work far more manageable while still letting us draw conclusions.
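The polling idea can be sketched with a small simulation. Everything here is assumed for illustration: the "true" level of support (52%), the name `run_poll`, and the poll size of a thousand voters. The point is only that different random samples give slightly different answers.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Assumption for illustration only: 52% of the electorate favours candidate A.
TRUE_SUPPORT = 0.52
SAMPLE_SIZE = 1_000

def run_poll(n):
    """Simulate polling n voters and return the share who favour candidate A."""
    votes = sum(1 for _ in range(n) if rng.random() < TRUE_SUPPORT)
    return votes / n

# Repeat the same poll five times to see how much the estimate moves.
polls = [run_poll(SAMPLE_SIZE) for _ in range(5)]
for i, share in enumerate(polls, start=1):
    print(f"Poll {i}: {share:.1%} favour candidate A")
```

Each poll is an estimate of the same underlying figure, and the spread between polls gives a feel for sampling variability.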

Statistical inference helps us to answer various such questions based on the samples drawn from the entire population.

There are two broad areas of statistical inference or inferential statistics as follows:

1. Statistical Estimation

2. Hypothesis Testing

Let’s understand both areas one by one.

1. Statistical Estimation:

Statistical estimation means taking a statistic from your sample data (for example, the sample mean) and using it to say something about the corresponding population parameter (in this case, the population mean).

The following are some of the questions that can be answered using statistical estimation:

  1. What is the average salary of a doctor?
  2. What percentage of doctors hold a specialisation degree?

Here, as you can see, we are trying to estimate the values of population parameters based on the sample.

Population parameter and Sample statistic

A parameter is a descriptive measure of the population.

Examples: population mean, population variance, etc.

A statistic is a descriptive measure of the sample.

Examples: sample mean, sample variance, etc.
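As a rough sketch of estimation, the snippet below computes sample statistics and uses them to estimate the two population parameters from the doctor questions. The survey data are simulated, and the 95% interval based on the normal approximation is one common choice rather than the only way to express uncertainty.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical survey of 400 doctors (simulated, not real data).
salaries = [rng.gauss(100_000, 20_000) for _ in range(400)]
has_specialisation = [rng.random() < 0.35 for _ in range(400)]

n = len(salaries)
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval

# Sample mean -> estimate of the population mean salary.
mean_hat = statistics.fmean(salaries)
se_mean = statistics.stdev(salaries) / math.sqrt(n)
mean_ci = (mean_hat - z * se_mean, mean_hat + z * se_mean)

# Sample proportion -> estimate of the population proportion.
p_hat = sum(has_specialisation) / n
se_p = math.sqrt(p_hat * (1 - p_hat) / n)
p_ci = (p_hat - z * se_p, p_hat + z * se_p)

print(f"Mean salary estimate: {mean_hat:,.0f} "
      f"(95% CI {mean_ci[0]:,.0f} to {mean_ci[1]:,.0f})")
print(f"Specialisation share: {p_hat:.1%} "
      f"(95% CI {p_ci[0]:.1%} to {p_ci[1]:.1%})")
```

Here `mean_hat` and `p_hat` are statistics (computed from the sample), while the quantities they estimate are parameters (properties of the whole population).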

2. Hypothesis Testing:

Hypothesis testing uses sample data to decide whether a claim about the population is supported, and in that way helps answer research questions.

The following are some of the questions that we can answer using hypothesis testing:

  • Are a doctor’s salary and education independent of each other?
  • Is the average salary of a doctor in a particular place greater than $100k?

Typically, certain assumptions are made about the population parameters, and hypothesis testing is a way to decide whether these assumptions are consistent with the data in the sample.
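The two example questions above could be tested along the following lines. This is a sketch under stated assumptions: the salaries and the 2x2 table of counts are invented for illustration, the mean is tested with a large-sample z-test (a t-test is the usual choice for small samples), and independence is checked with a chi-square test on salary band versus education level.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical sample of 200 doctor salaries from one city (simulated).
salaries = [rng.gauss(104_000, 20_000) for _ in range(200)]

# Test 1: H0: mean salary <= 100,000  vs  H1: mean salary > 100,000.
n = len(salaries)
se = statistics.stdev(salaries) / math.sqrt(n)
z_stat = (statistics.fmean(salaries) - 100_000) / se
p_mean = 1 - statistics.NormalDist().cdf(z_stat)  # one-sided p-value

# Test 2: H0: salary band and education level are independent.
# Rows: education (general, specialist).
# Columns: salary (below 100k, 100k or more). Counts are made up.
observed = [[60, 40],
            [30, 70]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (count - expected) ** 2 / expected

# For a 2x2 table (1 degree of freedom) the chi-square tail probability
# has the closed form erfc(sqrt(chi2 / 2)).
p_indep = math.erfc(math.sqrt(chi2 / 2))

print(f"Mean test:         z = {z_stat:.2f}, p = {p_mean:.4f}")
print(f"Independence test: chi2 = {chi2:.2f}, p = {p_indep:.6f}")
```

In both cases a small p-value means the sample would be surprising if the assumption (the null hypothesis) were true, which counts as evidence against that assumption.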

Data Science Aspirant