Groceries Insights: Trying to Improve my Life Through Data Analysis

Alexistats
3 min read · Jul 2, 2022

Data analysis is a wonderful discipline that helps businesses around the world make decisions that, hopefully, translate into positive outcomes for the company (usually additional dollars, conversions, etc.).

As a data analyst, I realized that if I can help businesses make informed decisions with data, I could also use data to make better-informed decisions in my daily life. That is when I decided to record every one of my grocery store receipts in a spreadsheet, to analyze one day. And that day is… now!

My ultimate goal is to identify obvious patterns in my purchasing habits, so that I can use this knowledge to reduce unnecessary spending and/or buy healthier food if possible. For example, I am wondering whether I spend more money at certain hours of the day, or on certain days of the week. Avoiding those times and days might help me keep my expenses down. Maybe I spend more at supper time when I am hungry, or perhaps on the weekends when I have more time to explore the aisles.

This project will involve a lot of data exploration in Tableau, and I will build a story of my grocery-trip behaviour through data visualization. These articles will help me document the project and the process of building different data visualizations in Tableau.

Below is an explanation of the dataset, followed by a list of questions that I am looking to answer through this exercise.

Methodology for Collecting the Data

I recorded every one of my grocery store receipts from January 8th, 2020 to February 21st, 2021. Being human, I might have missed a few, but I did my best, and hopefully we can spot any missing ones while exploring the data over time.

How did I record the data, and which data did I record?

I entered the data in a spreadsheet document (this one — link). For each grocery trip receipt, I recorded each item on one row. There are 81 receipts and 930 rows in the dataset.

Columns in the Dataset:

  • Date: The date of the trip.
  • Time: The time on the receipt. Unfortunately I did not record the time I departed from home, only the time when I completed my purchase.
  • Store: The store’s name. Could be used to see which stores are pricier or offer more discounts.
  • Brand: The item’s brand. Added when possible. Skipped for fruits/vegetables and other items where the brand was not obvious.
  • Item: The item’s name. I tried to keep things consistent, but we might discover discrepancies, since I named the items myself.
  • Quantity: How many of the item I purchased.
  • Price (total): The total amount spent on that item.
  • Discount: When available, the discount applied to the item, as a dollar amount.

In addition to recording those columns, I added the following three columns to better understand the data:

  • Unit Price: Price/quantity. The price per item.
  • Food_yesno: Whether the item is a food item. I recorded everything, and I would like to be able to exclude non-food items.
  • Is_healthy: An arbitrary definition of what I consider healthy. My rule of thumb is that non-processed items (fruits, veggies, meats, etc.) were considered healthy, while processed or high-sugar foods (cookies, alcohol, pizza, etc.) were considered not healthy.
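
As a rough illustration of the Unit Price column described above, here is a minimal Python sketch. The rows and their values are invented for the example (they are not taken from my dataset), and the `unit_price` helper is a hypothetical name; in the real spreadsheet this is just a formula dividing Price (total) by Quantity.

```python
from decimal import Decimal

# Hypothetical rows: the items and prices are invented for illustration only.
rows = [
    {"Item": "Bananas", "Quantity": 6, "Price (total)": Decimal("1.50")},
    {"Item": "Cheddar", "Quantity": 2, "Price (total)": Decimal("11.98")},
]


def unit_price(row):
    """Return the price per item, or None when the quantity is missing or zero."""
    quantity = row["Quantity"]
    if not quantity:
        return None
    return row["Price (total)"] / quantity


# Add the derived column to every row.
for row in rows:
    row["Unit Price"] = unit_price(row)

for row in rows:
    print(row["Item"], row["Unit Price"])
```

Using `Decimal` rather than floats keeps dollar amounts exact, which is a design choice for this sketch and not something the spreadsheet itself requires.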

Questions to Answer in Future Articles

  • On which day of the week and at what time of day did I shop most often?
  • On which day of the week and at what time of day did I spend the most on average?
  • What is my favorite product?
  • What is my favorite brand?
  • Which was my favorite store? Which one was cheaper, and which one had more discounts?
  • How often did I purchase at a discount? Was there a day of the week or time of day when I purchased more or less at a discount?
  • Does more discount always mean a lower price (between different stores) for the same item?
  • How much did certain items cost me over the year (alcohol, cheese, unhealthy food, healthy food)?
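
To make the first two questions concrete, here is a minimal Python sketch of the aggregation behind them: number of trips and average spend per trip, by day of the week. The rows are invented for illustration, and treating each (date, store) pair as one receipt is an assumption of this sketch, not a rule from my dataset. In the articles themselves I will do this in Tableau.

```python
from collections import defaultdict
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical item rows: (date of trip, store, total price of the item).
rows = [
    (date(2020, 1, 8), "Store A", 4.00),
    (date(2020, 1, 8), "Store A", 6.00),
    (date(2020, 1, 11), "Store B", 12.50),
    (date(2020, 1, 15), "Store A", 5.00),
]

# Assumption: one receipt per (date, store) pair. Sum item rows per receipt.
receipt_totals = defaultdict(float)
for trip_date, store, price in rows:
    receipt_totals[(trip_date, store)] += price

# Count trips and accumulate spend per day of the week.
trips = defaultdict(int)
spend = defaultdict(float)
for (trip_date, _store), total in receipt_totals.items():
    weekday = DAYS[trip_date.weekday()]  # Monday is index 0
    trips[weekday] += 1
    spend[weekday] += total

# Average spend per trip for each weekday that has at least one trip.
average_spend = {day: spend[day] / trips[day] for day in trips}

for day in DAYS:
    if day in trips:
        print(day, trips[day], average_spend[day])
```

The same grouping works for time of day by bucketing the receipt time into hours instead of weekdays.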

Next Article

In my next article, I will explain how to build a bar chart in Tableau to answer the first two questions above, that is, to explore my purchasing habits by time of day and day of the week.

Alexistats

I am a data analyst with a math background, and I love hockey!