How to Tell Between Two Regression Models with Statistical Significance

Diving into the F-test for nested models with algorithms, examples and code

LucianoSphere (Luciano Abriata, PhD)
TDS Archive
9 min read · Jan 3, 2025

Introduction

When analyzing data, one often needs to compare two regression models to determine which one best fits the data. Often, one model is a simpler version of a more complex model that includes additional parameters. However, more parameters do not guarantee that the more complex model is actually better, as they could simply overfit the data.

To determine whether the added complexity is statistically justified, we can use what’s called the F-test for nested models. This statistical technique evaluates whether the reduction in the Residual Sum of Squares (RSS) achieved by the additional parameters is meaningful or just due to chance.
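The article’s own code is in Matlab (see below), but the core computation is compact enough to sketch here in Python. The function name and variable names are mine, not from the article; the formula is the standard F-statistic for nested models, comparing the drop in RSS per extra parameter against the residual variance of the complex model.

```python
import numpy as np
from scipy import stats

def f_test_nested(rss_simple, p_simple, rss_complex, p_complex, n):
    """F-test for nested regression models.

    rss_simple, rss_complex: residual sums of squares of the two fits
    p_simple, p_complex: number of fitted parameters in each model
    n: number of data points
    Returns the F statistic and its p-value.
    """
    # Improvement in RSS per additional parameter...
    numerator = (rss_simple - rss_complex) / (p_complex - p_simple)
    # ...relative to the residual variance of the complex model.
    denominator = rss_complex / (n - p_complex)
    F = numerator / denominator
    # Right-tail probability under the F distribution with
    # (p_complex - p_simple, n - p_complex) degrees of freedom.
    p_value = stats.f.sf(F, p_complex - p_simple, n - p_complex)
    return F, p_value

# Hypothetical example: a 2-parameter line vs. a 3-parameter
# quadratic fitted to 30 points.
F, p = f_test_nested(rss_simple=120.0, p_simple=2,
                     rss_complex=100.0, p_complex=3, n=30)
print(F, p)
```

A small p-value (say, below 0.05) indicates that the RSS reduction is larger than chance alone would produce, so the extra parameters are warranted.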

In this article I explain the F-test for nested models, present a step-by-step algorithm, demonstrate its implementation in pseudocode, and provide Matlab code that you can run right away or re-implement in your favorite system (I chose Matlab because it gave me quick access to statistics and fitting functions, on which I didn’t want to spend time). Throughout the article we will see the F-test for nested models at work in a couple of settings, including some examples I built into the example Matlab code.

The F-test for Nested Models

