3 Simple Steps to Perform Linear Regression in Excel

Dylan Song
Data And Beyond
Published in
4 min readAug 24, 2023
This is what the final scatterplot will look like!

Table of Contents

· Introduction
· The Dataset
· Step 1: Come Up with Your Statistical Question
· Step 2: Selecting the 2 Columns of Interest in Excel
· Step 3: Create the Scatterplot
· Want to See More?

Introduction

In this blog post, I’ll show you 3 simple steps to create a linear regression model in Excel. Linear regression is important because it can tell you important trends between two or more variables in the data. So, let’s get started!

The Dataset

The Kaggle dataset I’ll use in this post shows all the sale prices for the houses in the King County area (or the Seattle area). Here’s the download link, if you want to follow along: https://www.kaggle.com/datasets/harlfoxem/housesalesprediction

Photo by Thom Milkovic on Unsplash

Step 1: Come Up with Your Statistical Question

Before we create the model, we need to know what variables we’re interested in. For this post, my question is going to be “What is the relationship between the square feet in the living room and the price?

So, my variables are going to be the square feet and the price. You can choose your own variables, but you need to make sure that they’re continuous variables, or else you’re gonna run into a problem.

Now, we need to know what our independent, which is on the x-axis, and our dependent, which is on the y-axis, variables are. In this example, it makes sense to have the house price as the dependent variable because we’re assuming that the number of square feet in the house is going to affect what the price is.

In the next step, we’re going to hop over onto Excel.

Step 2: Selecting the 2 Columns of Interest in Excel

First, you have to select the second cell in the column that has all the square feets and then do Ctrl+Shift+Down:

Then, hold the Ctrl key and click on the second cell in the price column and do Ctrl+Shift+Down again:

Step 3: Create the Scatterplot

Now, go to “Insert” and click on this button over here:

But, there’s actually a problem:

You can see that all the prices are on the x-axis, which is NOT what we want.

So to fix this, you have to right-click on the graph and click “Select Data”:

This is going to open a new window that looks like this:

From here, click on “Edit”. This is going to open another window:

In the “Series X values” box, get rid of everything inside and starting from the 2nd cell, highlight all the values in the square feet column. In the “Series Y values” box, do the same thing but this time, highlight all the values in the price column.

So, the boxes should look like this:

Now, the scatterplot looks a lot better now, with the square feet on the x-axis and the prices on the y-axis:

Want to See More?

If you want to see me create a regression line over this scatterplot and analyze what the model tells us, check out my full video!

I hope to see you over there!

Originally published at https://dylans0ng.github.io on August 24, 2023.

--

--

Dylan Song
Data And Beyond

Data Science Enthusiast | Blogs on any data-related content! | FREE tutorials here: https://www.youtube.com/@dylan_song