We love Excel! We just need to use it better. Image from Pixabay.

Easy, Insightful Regression Models in Excel? No way.

But it’s true: Valuable data insights in under 5 minutes, guaranteed.

Aditi Mahabal
Published in
3 min readJun 5, 2020

--

Introduction

Excel is a powerful data manipulation and analysis tool that, once mastered, can entirely replace the lengthy and confusing coding that comes with data analysis in Python, R, and the like. This article is an example of one of those Excel tools: creating single and multiple linear regression models that output a plethora of neatly arranged statistics, graphs, and insights, with the click of just a few buttons.

Regressions are used to model the relationship between one or more independent variables and their corresponding dependent variable. This is done by fitting a line (“line of best fit”) between the variables, which can aid in predicting future outcomes — regressions are a handy data science tool, made even simpler with data analysis in Excel.

Tutorial

Begin by storing all independent/x-values in one column (if using multiple independent variables, store each in its own column), and the dependent/y-value in a separate column. The small table should look something like this:

Next, head to the “Data” tab, and click on the “Data Analysis” button in the upper right corner of the screen. Don’t see a Data Analysis button?

  1. At the top of your screen, select “Tools”, and click on“Excel Add-ins”.
  2. Check the box next to “Analysis ToolPak” and click OK.

The Data Analysis button should appear immediately. After having clicked on the button, scroll and select “Regression” from the list (and click OK).

Now, there are several different boxes and inputs in the pop-up on the screen, but don’t be alarmed! The process for inputting data is easy. Begin by clicking on the mini-sheet icon in the “Input Y Range” box, and select the ENTIRE column of data, INCLUDING the column name. Click enter, and the data range appears in the input box. Repeat the same process for the “Input X Range” box, with either one single or multiple columns selected. Check the boxes selected in the above screenshot, and hit OK.

And that’s it! The regression should pop up in a new sheet, unless otherwise specified — and with a LOT of statistical data.

The most important numbers here are included in the “Regression Statistics” box: Multiple R and R Squared return the correlation strength/accuracy of the regression, and are important to evaluating if the linear regression is a good model. The graphs are helpful in visualizing the same concept, and are great ways to explain the regression to a non-technical audience.

Conclusion

As demonstrated above, creating linear regressions in Excel is VERY easy. All of the returned statistics, tests, and graphs would have required endless manual coding on other platforms, but were generated in under 5 minutes using Excel’s Data Analysis tool. I hope this article helps speed up your data analysis, and does the capabilities of Excel some justice!

--

--

Aditi Mahabal
The Startup

University of Virginia undergraduate. I write about data science and cool projects I’ve done! linkedin.com/in/aditimahabal/