Hypothesis Testing in R

Vicky
Published in 8bitDS
3 min read · Dec 7, 2022

Utilizing Hypothesis Testing with R: An Introduction to Statistical Significance Testing

credit: https://cran.r-project.org/web/packages/mcStats/readme/README.html

Hypothesis testing is a statistical method for evaluating a claim about a population using a sample of data. The goal of hypothesis testing is to determine whether the sample provides sufficient evidence against a default assumption (the null hypothesis) to favor an alternative, or whether the data are consistent with that default.

In hypothesis testing, we start by stating the null hypothesis, which is the hypothesis that we want to test. The null hypothesis is typically a statement of no effect or no difference between two groups. For example, the null hypothesis might be that there is no difference in the average heights of men and women.

Next, we state the alternative hypothesis, which is the hypothesis that we want to evaluate against the null hypothesis. The alternative hypothesis is typically the opposite of the null hypothesis. In our example, the alternative hypothesis might be that there is a difference in the average heights of men and women.

Once we have stated the null and alternative hypotheses, we collect a sample of data and use it to test the hypotheses. We use statistical tests to evaluate the sample data and determine whether it provides sufficient evidence to reject the null hypothesis in favor of the alternative. If it does not, we fail to reject the null hypothesis, which is not the same as showing that the null hypothesis is true.

“Hypothesis testing is the art of deciding whether the observed data support or contradict a particular hypothesis. In other words, hypothesis testing is the art of distinguishing between the plausible and the implausible.”

- Richard Royall

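To perform hypothesis testing in R, we can use the t.test() function from the stats package, which ships with R and is loaded by default. This function lets us specify the sample data, the value assumed under the null hypothesis (the mu argument), the direction of the alternative hypothesis (the alternative argument), and the variant of t-test to run.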

Here is an example of using the t.test() function to perform a hypothesis test in R:

In this example, we first load the tidyverse library. This step is optional, because t.test() comes from the stats package rather than from tidyverse. We then define the null and alternative hypotheses, load the sample data, and perform the hypothesis test using the t.test() function. Finally, we print the test results, which include the test statistic, the degrees of freedom, the p-value, and the confidence interval.
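The steps described above can be sketched as follows. This is a minimal illustration rather than the original listing: the height data are simulated, and the means, standard deviations, and sample sizes are invented for the example. Nothing needs to be loaded, since t.test() is available from the stats package in a default R session.

```r
# Simulated heights in cm; these numbers are invented for illustration only.
set.seed(42)
men   <- rnorm(50, mean = 176, sd = 7)
women <- rnorm(50, mean = 163, sd = 6)

# H0: the mean heights are equal (difference in means = 0).
# H1: the mean heights differ (two-sided alternative).
result <- t.test(men, women, alternative = "two.sided", mu = 0)

# Printing shows the t statistic, degrees of freedom, p-value,
# and a 95% confidence interval for the difference in means.
print(result)
```

With two numeric vectors and no further arguments, t.test() runs Welch's two-sample test, which does not assume equal variances in the two groups.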

The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true. If the p-value is below a predetermined threshold (the significance level, typically 0.05), then we reject the null hypothesis in favor of the alternative hypothesis.

In our example, if the p-value is below 0.05, we reject the null hypothesis and conclude that the data provide evidence of a difference in the average heights of men and women. Otherwise, we fail to reject the null hypothesis.
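As a rough sketch of that decision rule, the p-value can be read from the object that t.test() returns and compared with the chosen threshold. The data are simulated for illustration, and the names alpha and decision are hypothetical.

```r
# Hypothetical data, simulated for illustration.
set.seed(1)
men   <- rnorm(40, mean = 176, sd = 7)
women <- rnorm(40, mean = 163, sd = 6)

alpha  <- 0.05   # significance level, chosen before looking at the data
result <- t.test(men, women)

# t.test() returns a list of class "htest"; p.value holds the p-value.
if (result$p.value < alpha) {
  decision <- "reject the null hypothesis"
} else {
  decision <- "fail to reject the null hypothesis"
}
cat("p-value:", format.pval(result$p.value), "->", decision, "\n")
```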

In addition to the t.test() function, R also includes several other functions for performing different types of hypothesis tests. For example, the chisq.test() function can be used for chi-square tests, the prop.test() function can be used for tests of proportions, and the ks.test() function can be used for Kolmogorov-Smirnov tests.
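A short sketch of these three functions is shown below. The counts and samples are made up purely to show the calling conventions.

```r
set.seed(123)

# Chi-square test of independence on a 2x2 table of made-up counts.
counts <- matrix(c(30, 20, 15, 35), nrow = 2,
                 dimnames = list(group = c("A", "B"),
                                 outcome = c("yes", "no")))
chi_res <- chisq.test(counts)

# Test of equal proportions: 45 successes out of 100 versus 60 out of 100.
prop_res <- prop.test(x = c(45, 60), n = c(100, 100))

# Two-sample Kolmogorov-Smirnov test: do the two samples share a distribution?
ks_res <- ks.test(rnorm(100), runif(100))

# Each call returns an "htest" object with a p.value element.
sapply(list(chisq = chi_res, prop = prop_res, ks = ks_res),
       function(res) res$p.value)
```

All three return the same kind of object as t.test(), so the p-value and test statistic can be extracted in the same way.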

One useful feature of t.test() is that it covers several kinds of t-test, depending on the arguments provided: one-sample, two-sample, and paired-sample tests. Tests for variances and proportions are not part of t.test(); they are handled by separate functions, var.test() and prop.test().

Additionally, the var.equal argument of t.test() selects between Welch's t-test (the default, var.equal = FALSE) and Student's t-test with pooled variance (var.equal = TRUE). The Wilcoxon rank-sum test is not an option of t.test(); it is provided by wilcox.test(). Together, these arguments make the t.test() function a versatile tool for hypothesis testing in R.
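The variants can be compared side by side in a small sketch. The before and after measurements are simulated and hypothetical, and the same data are reused for every call only to show the arguments. In practice the design of the study determines which test is appropriate.

```r
set.seed(7)
# Hypothetical before/after measurements on the same 30 subjects.
before <- rnorm(30, mean = 100, sd = 10)
after  <- before + rnorm(30, mean = 3, sd = 5)

one_sample <- t.test(before, mu = 100)                  # H0: mean equals 100
welch      <- t.test(before, after)                     # Welch two-sample (default)
student    <- t.test(before, after, var.equal = TRUE)   # pooled-variance Student's t
paired     <- t.test(before, after, paired = TRUE)      # paired t-test

# Related tests live in their own functions, not in t.test().
rank_sum   <- wilcox.test(before, after)                # Wilcoxon rank-sum test
variances  <- var.test(before, after)                   # F test for equal variances

# The method element names the test that was actually run.
sapply(list(one_sample, welch, student, paired, rank_sum, variances),
       function(res) res$method)
```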
