STATISTICAL TESTING

Kallepalliravi
Published in Analytics Vidhya
3 min read · Oct 2, 2020

Overview of the most common statistical tests

Image Credits : memegenerator.net

INTRODUCTION

Before moving into statistical testing, let's understand what a statistic is. For that, we first need to understand what a population and a sample are.

Population: Includes all elements of interest

Sample: Subset of observations from a population

A measurable quantity computed from the whole population is called a parameter; the same quantity computed from a sample of the population is called a statistic.

For example, suppose you want to know the average height of all men in the United States. If you survey every man in the United States and average all the heights, the result is a parameter. Since this is practically very hard, you instead collect data from a random subset of men in the United States and calculate the average height of that sample; the result is a statistic.
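The distinction can be sketched in a few lines of Python. This is only an illustration: the population below is simulated, and the mean of 175 cm, the spread of 7 cm, and the sample size of 500 are made-up assumptions, not real survey figures.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated heights in cm (made-up values).
population = [random.gauss(175, 7) for _ in range(100_000)]

# Parameter: computed from every member of the population.
parameter = statistics.mean(population)

# Statistic: computed from a random sample of 500 members.
sample = random.sample(population, 500)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two numbers will be close but not identical; a statistical test quantifies how much that gap could be due to sampling alone.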

A statistical test is a way to make a decision about a population parameter using a sample of data from the population. This is done through hypothesis testing. You can review hypothesis testing in the article below.

STATISTICAL TESTS

In this section we will go over the most commonly used statistical tests. This article gives an overview only and does not go into the mathematical details behind these tests.

ASSOCIATION TESTS:

They measure the association between two variables.

Correlation measures how strongly two variables are associated with each other. The coefficient ranges from -1 to 1. A correlation of 1 means a perfect positive association, a correlation near 0 means little or no association, and a correlation of -1 means a perfect negative association.

Pearson Correlation: Tests for linear correlation between two continuous variables.

Spearman Correlation: Tests for rank-based correlation between two ordinal variables.

Chi-Square: Tests for association between two categorical variables. Unlike a correlation coefficient, the chi-square statistic is not bounded between -1 and 1.
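As a rough sketch, the three association tests above could be run with SciPy's `scipy.stats` module. All of the data here (study hours, scores, survey ratings, and the 2x2 table of counts) is invented for illustration, and using SciPy is an assumption, since the article does not name a library.

```python
from scipy import stats

# Pearson: two continuous variables (made-up study hours and exam scores).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
score = [52, 55, 61, 60, 68, 71, 75, 80]
pearson_r, pearson_p = stats.pearsonr(hours, score)

# Spearman: two ordinal variables (made-up 1-5 survey ratings).
satisfaction = [1, 2, 2, 3, 4, 4, 5, 5]
recommend = [1, 1, 2, 3, 3, 4, 4, 5]
spearman_rho, spearman_p = stats.spearmanr(satisfaction, recommend)

# Chi-square: two categorical variables, given as a table of counts.
# Rows and columns are hypothetical categories, e.g. group x outcome.
table = [[30, 10],
         [10, 30]]
chi2, chi2_p, dof, expected = stats.chi2_contingency(table)

print(f"Pearson r = {pearson_r:.3f}, p = {pearson_p:.4f}")
print(f"Spearman rho = {spearman_rho:.3f}, p = {spearman_p:.4f}")
print(f"Chi-square = {chi2:.3f}, dof = {dof}, p = {chi2_p:.6f}")
```

A small p-value suggests the association is unlikely to be due to chance alone; the coefficient itself (for Pearson and Spearman) indicates direction and strength.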

COMPARATIVE TESTS:

They test for a difference in means between groups, or between a sample and a reference value.

One Sample t-test: This test is performed when you want to compare a sample mean to a known or hypothesized population mean.

Two Sample t-test: This test is performed when you want to compare the means of two independent groups.

Paired Sample t-test: This test is performed when you want to compare the means of two related groups. Related groups can be matched pairs across groups or the same group measured twice, for example once before a change and once after it. This is a good test for pre-post analysis.

Analysis of Variance (ANOVA): This test is performed when you want to simultaneously compare the means of more than two independent groups.
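The four comparative tests above might look like this with `scipy.stats`. This is a minimal sketch under assumptions: the measurements are invented, the reference mean of 175 is arbitrary, and SciPy is a library choice the article does not make.

```python
from scipy import stats

# One-sample t-test: sample mean vs. a hypothesized population mean of 175.
heights = [172, 175, 169, 180, 177, 174, 171, 178]
one_sample = stats.ttest_1samp(heights, popmean=175)

# Two-sample t-test: means of two independent groups.
group_a = [20, 22, 19, 24, 25, 21, 23, 22]
group_b = [28, 30, 27, 31, 29, 32, 30, 29]
two_sample = stats.ttest_ind(group_a, group_b)

# Paired t-test: the same subjects measured before and after a change.
before = [80, 85, 78, 90, 88, 84, 91, 79]
after = [76, 79, 75, 83, 86, 79, 90, 71]
paired = stats.ttest_rel(before, after)

# One-way ANOVA: means of three independent groups at once.
group_c = [35, 33, 36, 34, 37, 35, 36, 34]
anova = stats.f_oneway(group_a, group_b, group_c)

for name, res in [("one-sample t", one_sample), ("two-sample t", two_sample),
                  ("paired t", paired), ("ANOVA", anova)]:
    print(f"{name}: statistic = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

Note that `ttest_ind` assumes equal variances by default; passing `equal_var=False` switches to Welch's t-test when that assumption is doubtful.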

PREDICTIVE TESTS:

They measure if change in one variable predicts change in another variable.

Linear Regression: This test evaluates how well an outcome (the "dependent variable") can be predicted from one or more predictors (the "independent variables"). It is often framed as a cause-and-effect relationship, although regression on its own shows association and does not prove causation. If there is only one predictor variable, it is called simple linear regression; if there are two or more predictors, it is called multiple linear regression.
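A simple linear regression with one predictor can be sketched with `scipy.stats.linregress`. The data points are made up for illustration, and the variable names `spend` and `sales` are hypothetical. Multiple linear regression needs a different tool and is not shown here.

```python
from scipy import stats

# Made-up data: one predictor (spend) and one outcome (sales).
spend = [1, 2, 3, 4, 5, 6]
sales = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

# Fit sales = intercept + slope * spend by least squares.
fit = stats.linregress(spend, sales)

print(f"slope = {fit.slope:.4f}")
print(f"intercept = {fit.intercept:.4f}")
print(f"r-squared = {fit.rvalue ** 2:.4f}")
print(f"p-value for slope = {fit.pvalue:.6f}")

# Predict the outcome for a new, hypothetical predictor value.
new_spend = 7
predicted = fit.intercept + fit.slope * new_spend
print(f"predicted sales at spend={new_spend}: {predicted:.2f}")
```

The p-value reported by `linregress` tests the null hypothesis that the slope is zero, that is, that the predictor carries no linear information about the outcome.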

NON-PARAMETRIC TESTS:

These tests are used when the data do not meet the assumptions of parametric tests, such as the requirement that the data be normally distributed. They are often called distribution-free tests. Parametric tests typically compare group means, while non-parametric tests typically compare group medians or ranks.

Wilcoxon rank-sum test: This test is performed when you want to test the difference between two independent groups. It ranks all observations from both groups together and compares the rank sums of the two groups.

Wilcoxon signed-rank test: This test is performed when you want to test the difference between two related groups. It takes into account both the magnitude and the direction of the paired differences.

Sign test: This test is performed when you want to test the difference between two related groups. It takes into account only the direction of the paired differences, not their magnitude.
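The three non-parametric tests above can be sketched as follows. The two Wilcoxon tests use `scipy.stats`; the sign test is a small hand-written helper (`sign_test` is a hypothetical name, not a SciPy function) based on the binomial distribution. All data is invented for illustration.

```python
from math import comb
from scipy import stats

# Wilcoxon rank-sum: two independent groups.
group_a = [20, 22, 19, 24, 25, 21, 23, 22]
group_b = [28, 30, 27, 31, 29, 32, 30, 29]
rank_sum = stats.ranksums(group_a, group_b)

# Wilcoxon signed-rank: paired measurements, uses size and sign of differences.
before = [80, 85, 78, 90, 88, 84, 91, 79]
after = [76, 79, 75, 83, 86, 79, 90, 71]
signed_rank = stats.wilcoxon(before, after)

def sign_test(x, y):
    """Two-sided sign test: uses only the sign of each paired difference."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided p-value.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

sign_p = sign_test(before, after)

print(f"rank-sum:    statistic = {rank_sum.statistic:.3f}, p = {rank_sum.pvalue:.4f}")
print(f"signed-rank: statistic = {signed_rank.statistic:.3f}, p = {signed_rank.pvalue:.4f}")
print(f"sign test:   p = {sign_p:.4f}")
```

Because the sign test discards the size of each difference, it generally has less power than the signed-rank test on the same data, but it also makes fewer assumptions.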

CONCLUSION:

In this article we have reviewed:

  1. What statistical testing is.
  2. An overview of the different kinds of statistical tests.
