MS Excel — Accountants Should Understand These Basic Data Management Techniques!

“I don’t know” is no longer acceptable…

Excel >> Data Ribbon >> From Text/CSV

Overview

I received an email last week from a CPA client of mine who asked me…

“I have a REALLY large text file I downloaded from the IRS website listing exempt organizations. Is there a way I can get this into Excel and use it?”

It’s 2018. Every accountant should ABSOLUTELY UNDERSTAND how to approach this simple data management challenge!


Dataset Review

There are 4 basic steps to consider (consider the “ETLR” process):

Extract and Transform

  • Download or connect to the dataset.
  • Import the data into Excel and “shape” it.

Load

  • Load the final data into Excel.

Report

  • Prepare analysis.

I looked at the dataset and quickly figured out that the text file download is >76 megabytes. This means is is LIKELY going to have more than 1,048,576 and I won’t be able to load it into an Excel worksheet.

No problem. And we will do this in LESS THAN 10 MINUTES!

Stepping Through The Process (less than 10 minutes!)

EXTRACT The Dataset

TIME — 1 minute, 30 seconds (01:30)
IRS Website — Here’s the Download Link (ZIP file)
  • Download the ZIP file and extract the .txt file.
ZIP file and extracted .txt file

TRANSFORM the Dataset (using Excel’s Power Query Tool)

TIME — 5 minutes (05:00)
  • Open Microsoft Excel (I’m using Excel 2016) and connect to .txt file to import it.
Excel >> Data Ribbon >> From Text/CSV
Navigate to folder and select .txt file
  • Data layout screen — select “Transform”.
Connect to Data >> Click on “Transform Data”
  • Power Query editor opens — start transformation process.
Power Query edit view — Start transforming data
  • Completed “basic” transformation. Time to “Load” the data.
Dataset is now modified to proper format — Time to Load it

LOAD The Data (Into Excel’s “Data Model”)

TIME — less than 1 minute (01:00)
  • I will load this data into the Data Model — this loads your data into the “background” in Excel. I AM NOT restricted on number of records since I am not loading it into the Excel worksheet (i.e., I won’t lose any data).
Excel Power Query “Import Data” screen.
  • 1,067,173 records are loaded into the Data Model. Had I loaded this into the Excel worksheet, over 20,000 records would be dropped.
Data imported into the Data Model

REPORT On The Data (Develop Analysis)

TIME — 2 minutes (02:00)

As shown above, to start analysis, simply “Insert” a Pivot Table and Excel automatically expects that you will use the data contained in the Data Model and makes that selection for you.

Then just start getting the information you want!

  • In the pivot table below, I have indicated the COUNT of exempt entities in Ohio, by City.
Basic Pivot Table — Count of locations by City within the State of Ohio

If this is new to you, it may be confusing.

But in 2018, this is a very BASIC dataset and Excel data conversion process.

If you want to look at the Excel file, you can download it here.

Now, you now longer have to say “I don’t know…”

About Don

“It’s time for DIFFERENT”

Don is passionate about helping professionals and organizations keep up, develop and adapt to the changing business world that we operate in.

“What Do You Do?”

Connect with Don!

LinkedIn, Flipboard, Twitter, Snapchat