# Statistical Tests for Data Analysis Part-I

Jun 18, 2020 · 4 min read

Statistical tests allow researchers to make inferences because they indicate whether an observed pattern is likely due to an intervention or to chance. There is a wide range of statistical tests. Which test to use depends on the research design, the distribution of the data, and the type of variable.

In general, if the data is normally distributed, parametric tests should be used. If the data is non-normal, non-parametric tests should be used. Below is a list of just a few common statistical tests and their uses.

# 1. Pearson Correlation

Pearson correlation is one of the most widely used measures of the association between two variables, and it is based on their covariance. It gives information about the magnitude of the association, or correlation, as well as the direction of the relationship.

The value r = 1 means a perfect positive correlation and the value r = -1 means a perfect negative correlation. So, for example, you could use this test to find out whether people’s height and weight are correlated (they will be — the taller people are, the heavier they’re likely to be).

Positive correlation indicates that both variables increase or decrease together, whereas negative correlation indicates that as one variable increases, so the other decreases, and vice versa.

Requirements:

• Scale of measurement should be interval or ratio
• Variables should be approximately normally distributed
• The association should be linear
• There should be no outliers in the data

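The coefficient is calculated as:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

where: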

• n is sample size
• xi and yi are the individual sample points indexed with i
• x-bar and y-bar are respective mean values
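
As a rough illustration, Pearson's r can be computed directly from the quantities listed above (n, xi, yi, x-bar and y-bar) using the standard definition: the summed products of the deviations from the means, divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and the height and weight values are made up for this sketch and do not come from a real dataset.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from the textbook formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n  # mean of x
    y_bar = sum(y) / n  # mean of y
    # Numerator: sum of the products of the deviations from the means
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sqrt of (sum of squared x deviations * sum of squared y deviations)
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical heights (cm) and weights (kg), invented for illustration
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 88]
r = pearson_r(heights, weights)
print(f"Pearson r = {r:.3f}")  # close to +1: a strong positive linear association
```

Note that this sketch divides by zero if either variable is constant, since the coefficient is undefined in that case.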

# 2. Spearman Correlation

When data are measured on at least an ordinal scale, the ordered categories can be replaced by their ranks and Pearson’s correlation coefficient calculated on these ranks. Spearman’s rank correlation coefficient summarises the strength and direction (positive or negative) of the relationship between two variables.

The value r = 1 means a perfect positive correlation and the value r = -1 means a perfect negative correlation. So, for example, you could use this test to find out whether people’s height and shoe size are correlated (they will be: the taller people are, the bigger their feet are likely to be).

Requirements:

• Scale of measurement must be ordinal (or interval, ratio)
• Data must be in the form of matched pairs
• The association must be monotonic (i.e., variables increase in value together, or one increases while the other decreases)

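When there are no tied ranks, the coefficient is calculated as:

$$
r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}
$$

where:

• Σd² is the sum of the squared differences between the pairs of ranks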
• n is the number of pairs
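
A minimal sketch of the calculation, assuming no tied values: rank each variable, take the difference between the two ranks of every pair, and plug the sum of the squared differences into 1 - 6Σd² / (n(n² - 1)). The helper names `ranks` and `spearman_r` and the shoe-size data are hypothetical, chosen only for illustration.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. This sketch rejects ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_r(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be matched pairs (at least 2)")
    n = len(x)
    # d is the difference between the two ranks of each matched pair
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical heights (cm) and shoe sizes (EU), invented for illustration
heights = [150, 160, 170, 180, 190]
shoe_sizes = [36, 39, 38, 42, 44]
r_s = spearman_r(heights, shoe_sizes)
print(f"Spearman r = {r_s:.2f}")
```

Here only one pair of neighbouring ranks is swapped, so Σd² = 2 and the coefficient is 1 - 12/120 = 0.9. With tied values the shortcut formula is no longer exact, and the usual approach is to assign average ranks and compute Pearson's coefficient on them.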

# 3. Chi-Square

The Chi-square test is intended to test how likely it is that an observed distribution is due to chance. It is also called a “goodness of fit” statistic, because it measures how well the observed distribution of data fits with the distribution that is expected if the variables are independent.

NOTE: A Chi-square test can tell you information based on how you divide up the data. However, it cannot tell you whether the categories you constructed are meaningful.

Requirements:

• The sampling method is simple random sampling.
• The variable under study is categorical.
• The expected value of the number of sample observations in each level of the variable is at least 5.

The test statistic is calculated as:

$$
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
$$

where:

• O is the observed value.
• E is the expected value.
• “i” is the “ith” position in the contingency table.
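
The statistic can be sketched in a few lines for a contingency table. Under independence, the expected count of a cell is its row total times its column total, divided by the grand total; each cell then contributes (O - E)² / E to the sum. The function name `chi_square_statistic` and the counts below are assumptions made for this example, not survey results.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Hypothetical counts: rows = women, men; columns = Democrat, Republican
observed = [[30, 20],
            [20, 30]]
chi2, dof = chi_square_statistic(observed)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

In this made-up table every expected count is 25, so each of the four cells contributes 1 and the statistic is 4 with 1 degree of freedom. That is above 3.841, the 5% critical value of the chi-square distribution with 1 degree of freedom, so independence would be rejected at that level.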

What is the Chi-Square test NOT for?

The Chi-square test is only meant to test the probability of independence of a distribution of data. It will NOT tell you any details about the relationship between them. If you want to calculate how much more likely it is that a woman will be a Democrat than a man, the Chi-square test is not going to be very helpful. However, once you have determined the probability that the two variables are related (using the Chi-square test), you can use other methods to explore their interaction in more detail…

Thanks for reading! If you want to get in touch with me, feel free to reach me at abinmj656@gmail.com or through my LinkedIn profile.

## The Startup

Get smarter at building your thing. Join The Startup’s +789K followers.

Written by

## Abin Joy

Secretly loves the story behind the data. :)
