Four steps to automating data analysis in Excel

Rohit Raj
Thrive in AI
Published in
5 min readJan 12, 2020

Today there is age of data analytics. Python is pre-dominant tool for data analysis. Each day there is new library for data analysis. However it is hard to beat Excel in data analysis for small data sets. If the tables that you need to create from your data keeps changing frequently. Then excel remains the tool to use for data analysis.

I was working on a project where both the data set and standard tables that I need to create, kept changing frequently. I started with excel formulas and kept finding ways to speed up the process of making them. In this article I will show how I improved data analysis process in excel

I will start with a simple example using formulas then show how you can handle widely changing data.

We have following data in data sheet of excel .

We have a list of users across different groups and locations. Columns Revenue and Expense contains revenue and expense to the company from each user.

We want to create following report

For each group, we want consolidated revenues and number of users. This is easily done by sumifs and countifs formula.

1 Using Formulas

We can create the above table by using excel sumifs and countifs formula. I will not go into details of these formula.

You can create revenue column using sumifs formula as follows

Similarly we can create Number of users column using countifs formula as follows

2 Using named ranges

If we have only one or two tables. This step is enough. But we have many tables. Say we have 30 -40 tables. All of which hundreds of formulas linked to data sheet. Then using named ranges can save lot of time any time you have to change data range across formulas.

You can create ranges by going through Formulas > Name Manager and defining range in popup window as follows

Using named ranges your worksheet formulas are much easier to read. And you can be sure that you are not making any referencing mistake. Anytime you make changes to named ranges all your formulas are automatically updated.

3 Automating changing named ranges using VBA

Now if you have only one or two named ranges, then you can update them manually when your data changes. But if you have lot of named ranges and your data is also changing frequently. Then you can use VBA macro to update named ranges automatically.

I used the following VBA code to update your named ranges when your underlying data changes.

You have to save following code in your excel file.

Public Sub dynamicrange()
Application.Calculation = xlCalculationManual
Dim i, j As Long
Dim new_range, x2, x3 As String
Dim Rng1 As Range
Dim name_range As Range
Set name_range = Selection
j = name_range.Rows.Count
For i = 2 To j
new_range = name_range(i, 6)
x2 = name_range(i, 1)
x3 = name_range(i, 2)
Set Rng1 = Sheets(x3).Range(new_range)
ActiveWorkbook.Names.Add Name:=x2, RefersTo:=Rng1
Next i
Application.Calculation = calcState
End Sub

Then we have to save the following data in new sheet of your file. You have to name the sheet as ‘macro’ to use code in step 4.

Now you can select the above range and run dynamic range macro given above. It will update your named ranges.

Many times the situation arises when number of rows in your data has changed. In this situation you have to update all your named ranges. In such a situation you can update all named ranges at once, using the above macro.

4 Automatically reading Excel files

Even after these three steps I was not satisfied. Who wants to copy paste your data in file created after the first three steps. So I wrote a VBA macro to copy the data from data file to your template file created after the first three steps. After data is copy pasted the macro calls the dynamic range macro in step 3 to update all named ranges. The dynamic range macro updates all tables in template file and finally the template file is saved as a new file in the output folder. This step is repeated for all data files in input folder and for each data file a separate output file is saved in the output folder.

Summary

The benefit of this approach is after going through all these steps, you can automate creation of any number of excel tables. Anytime you add a new table or change an existing table. It is as simple as changing formulas of an existing table.

Drawback is that you cannot process big data in this way. But if your data is small and consistently changing with large number of tables, then this is an optimum way.

Next I implemented this whole data analysis in python. I will write about that in next article.

--

--

Rohit Raj
Thrive in AI

Studied at IIT Madras and IIM Indore. Love Data Science