The Journey into Time Series: Beginning with Descriptive Analytics

Data Mastery Series — Episode 11: The Art of Forecasting (Part 2)

Donato_TH
Donato Story
6 min read · Jul 1, 2023

If you are interested in articles related to my experience, please feel free to contact me: linkedin.com/in/nattapong-thanngam

In a creative departure from the norm, we’re embarking on an illustrative journey today, weaving a distinctive tale that seamlessly intertwines with our theme of time series forecasting. To make our subject more engaging and accessible, we’re using storytelling and visual aids.

Our narrative revolves around Donato, a regular patron of the Robust Roast coffee shop who enjoys their signature brew, the “Optimizer Shot”. As Zara, the shop owner, notices Donato’s regular visits, she strikes up a friendship. Through their casual conversations, she discovers that Donato is a data science and analytics aficionado. Recognizing the potential of these disciplines to uncover customer behavior insights and enhance profitability, Zara asks Donato to mentor her.

Robust Roast: A Tale of Beans and Bytes — Part 1 (Image by Author)
Robust Roast: A Tale of Beans and Bytes — Part 2 (Image by Author)

Let’s indulge in a thought experiment. If you were Donato, where would you commence your coaching journey with Zara?

I propose that the optimal launching pad for our data-driven exploration is descriptive analytics. Let’s delve deeper:

What is Descriptive Analytics?

Descriptive analytics is a process that deciphers historical data to identify patterns and trends, essentially answering the question, “What happened?” A critical element of descriptive analytics is Univariate Analysis, which dissects a single variable to elucidate its patterns and distribution, without considering any interactions with other variables.

What are the Objectives/Benefits of Descriptive Analytics?

  • It provides valuable insights into past behaviors, helping infer their potential influence on future outcomes.
  • It illuminates patterns and trends within the data, enabling more informed decision-making.
  • It highlights areas requiring improvement, laying a robust foundation for strategic future planning.

To breathe life into our narrative, I’ve simulated transaction data, a common type of dataset across various industries. The table consists of the transaction date, the SKU that was sold, and the quantity sold.

Simulated Dataset (Image by Author)
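If you would like to follow along without the raw file, here is a minimal sketch of how a table like this could be simulated. It is not the exact script behind the image: the column names (`date`, `sku`, `qty`), the one-year date range, and the Poisson averages are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # fixed seed so the sketch is reproducible

# Assumed setup: one row per SKU per day for one year.
dates = pd.date_range("2022-01-01", "2022-12-31", freq="D")
mean_qty = {"A": 20, "B": 35, "C": 50}  # illustrative average daily sales per SKU

frames = [
    pd.DataFrame({"date": dates, "sku": sku, "qty": rng.poisson(lam, size=len(dates))})
    for sku, lam in mean_qty.items()
]
sales = pd.concat(frames, ignore_index=True)

print(sales.head())
print(sales.groupby("sku")["qty"].describe())  # quick univariate summary per SKU
```

The `describe()` call at the end is already a small piece of descriptive analytics: it reports the count, mean, spread, and quartiles of the quantity sold for each SKU.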

To kickstart our exploration of Descriptive Analytics, let’s begin with some fundamental visualization techniques. All of the code and the raw file are available in my GitHub repository [Link].

1. Frequency Distribution Tables

These tables display the count of various values or categories in a dataset. Key Components: Resembling a pivot table, the tables comprise columns for the sales volumes of products A, B, and C, with rows representing monthly sales. The values denote the frequency of sales volume for each product at each sales level.

Frequency Distribution Tables (Image by Author)
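One way such a table could be built is sketched below, assuming simulated data with hypothetical `sku` and `qty` columns. Quantities are grouped into sales levels of width 10 (an arbitrary choice), and `pd.crosstab` counts how many transactions of each SKU fall into each level.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulated data: 200 transactions per SKU (column names are assumptions).
sales = pd.DataFrame({
    "sku": np.repeat(["A", "B", "C"], 200),
    "qty": np.concatenate([rng.poisson(lam, 200) for lam in (20, 35, 50)]),
})

# Sales levels of width 10: [0, 10), [10, 20), ... wide enough to cover the maximum.
bins = np.arange(0, sales["qty"].max() + 11, 10)
sales["level"] = pd.cut(sales["qty"], bins=bins, right=False)

# Rows = sales level, columns = SKU, values = number of transactions.
freq_table = pd.crosstab(sales["level"], sales["sku"])
print(freq_table)
```

Each column sums to the number of transactions for that SKU, which is a handy sanity check when building a frequency table by hand.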

2. Heat Maps

While frequency distribution tables summarize the data as numbers, heat maps make the same information easier to read at a glance. A heat map encodes the magnitude of each value as a color gradient, so it works like a color-coded frequency distribution table and quickly reveals patterns, relationships, or correlations within a dataset.

Heat Maps (Image by Author)
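As a rough sketch of the idea, the snippet below colors a frequency table with matplotlib’s `imshow`. The data is simulated, the column names and the 10-unit sales levels are assumptions, and the color map is simply one I find readable.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
sales = pd.DataFrame({
    "sku": np.repeat(["A", "B", "C"], 200),
    "qty": np.concatenate([rng.poisson(lam, 200) for lam in (20, 35, 50)]),
})

# Lower edge of each 10-unit sales level, e.g. 37 -> 30.
level = (sales["qty"] // 10 * 10).rename("level")
freq = pd.crosstab(level, sales["sku"])

fig, ax = plt.subplots()
im = ax.imshow(freq.to_numpy(), cmap="YlOrRd", aspect="auto")  # darker cell = more transactions
ax.set_xticks(range(len(freq.columns)))
ax.set_xticklabels(list(freq.columns))
ax.set_yticks(range(len(freq.index)))
ax.set_yticklabels([f"{lo}-{lo + 9}" for lo in freq.index])
ax.set_xlabel("SKU")
ax.set_ylabel("Quantity sold per transaction")
fig.colorbar(im, ax=ax, label="Number of transactions")
# fig.savefig("heatmap.png") would write the figure to disk.
```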

3. Pie Charts

A pie chart is a circular graphic showing the proportion of different categories within a dataset. Each slice of the pie chart represents a category, and the size of the slice indicates the category’s proportion or percentage within the dataset.

Pie Charts (Image by Author)
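A pie chart of each SKU’s share of the total quantity sold might be drawn as follows. This is only a sketch on simulated data, with hypothetical `sku` and `qty` columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
sales = pd.DataFrame({
    "sku": np.repeat(["A", "B", "C"], 200),
    "qty": np.concatenate([rng.poisson(lam, 200) for lam in (20, 35, 50)]),
})

totals = sales.groupby("sku")["qty"].sum()
shares = totals / totals.sum()  # proportion of total quantity per SKU

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    totals.to_numpy(), labels=list(totals.index), autopct="%1.1f%%"
)
ax.set_title("Share of total quantity sold by SKU (simulated)")
print(shares.round(3))
```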

4. Bar Charts

A bar chart compares different categories or values using rectangular bars. The bars can be horizontal or vertical, with each bar’s length representing its value or frequency. It can also illustrate the values from the Frequency Distribution Tables, providing a more detailed understanding of each product’s behavior.

Bar Charts for 3 SKU comparison (Image by Author)
Bar Charts for showing the frequency of sale for each SKU (Image by Author)
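For the three-SKU comparison, a bar chart could be sketched like this. The data is simulated and the column names are assumptions; swapping `ax.bar` for `ax.barh` would give the horizontal version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
sales = pd.DataFrame({
    "sku": np.repeat(["A", "B", "C"], 200),
    "qty": np.concatenate([rng.poisson(lam, 200) for lam in (20, 35, 50)]),
})

totals = sales.groupby("sku")["qty"].sum()

fig, ax = plt.subplots()
bars = ax.bar(list(totals.index), totals.to_numpy())  # one vertical bar per SKU
ax.set_xlabel("SKU")
ax.set_ylabel("Total quantity sold")
ax.set_title("Total quantity by SKU (simulated)")
print(totals)
```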

5. Histograms

A histogram showcases the distribution of continuous numerical data by categorizing it into ‘bins’ or intervals. Each bar of the histogram represents a bin, and the bar’s height signifies the frequency within that bin. Adjusting the number or width of the bins can make the picture more or less detailed (in this case, we use 15 bins).

Histograms (Image by Author)
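Here is a small sketch of a 15-bin histogram on simulated daily quantities for a single SKU (the Poisson average of 35 is an assumption). `np.histogram` returns the same counts that `ax.hist` draws, which is useful when you want the numbers as well as the picture.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)
qty = rng.poisson(35, size=365)  # simulated daily quantity for one SKU

# 15 equal-width bins between the minimum and maximum quantity.
counts, edges = np.histogram(qty, bins=15)

fig, ax = plt.subplots()
ax.hist(qty, bins=15, edgecolor="black")
ax.set_xlabel("Quantity sold per day")
ax.set_ylabel("Number of days")

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:5.1f}, {hi:5.1f}): {n}")
```

Try changing `bins=15` to a much smaller or larger number to see how the shape of the distribution appears to change.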

6. Kernel Density Estimation (KDE)

KDE plots estimate a continuous variable’s probability density function by smoothing the data with a kernel function. A KDE plot displays a continuous line representing the estimated probability density of the data, providing a smoother alternative to a histogram.

Kernel Density Estimation (Image by Author)
Adjusted Bin Size (Image by Author)
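To show what “smoothing the data with a kernel function” means, here is a hand-rolled Gaussian KDE in NumPy. It is a sketch rather than the code behind the images: the data is simulated, the helper name `gaussian_kde_1d` is hypothetical, and the bandwidth uses a common rule of thumb (1.06 × standard deviation × n^(-1/5)) that I am assuming for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on `grid`."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)


rng = np.random.default_rng(5)
qty = rng.poisson(35, size=365).astype(float)  # simulated daily quantity

# Rule-of-thumb bandwidth (an assumption; larger values give a smoother curve).
bandwidth = 1.06 * qty.std(ddof=1) * len(qty) ** (-1 / 5)
grid = np.linspace(qty.min() - 4 * bandwidth, qty.max() + 4 * bandwidth, 400)
density = gaussian_kde_1d(qty, grid, bandwidth)

# A density should integrate to roughly 1.
area = density.sum() * (grid[1] - grid[0])
print(f"bandwidth={bandwidth:.2f}, area under curve={area:.3f}")

fig, ax = plt.subplots()
ax.hist(qty, bins=15, density=True, alpha=0.4, label="Histogram (15 bins)")
ax.plot(grid, density, label="Gaussian KDE")
ax.legend()
```

The bandwidth plays the same role for a KDE that the bin size plays for a histogram: too small and the curve is jagged, too large and real features are smoothed away.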

7. Box Plots

A box plot (or box-and-whisker plot) exhibits data distribution through quartiles and identifies outliers. The plot features a box representing the interquartile range (IQR), whiskers indicating the data’s range excluding outliers, and points or lines highlighting outliers.

Here’s how to read a box plot by comparing it with a histogram: the graph overlays dotted lines marking the minimum, 25th percentile, 50th percentile (median), 75th percentile, and maximum values. (Typically, a box plot can also show outliers, but this data doesn’t have any.)

Histograms vs. Box Plots (Image by Author)

Box plots streamline data comparison. While comparing three products using histograms could be complex, box plots simplify this task.

Box Plots for 3 SKU comparison (Image by Author)
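The numbers behind a box plot can be computed directly, which helps when reading one. The sketch below uses simulated data with hypothetical `sku` and `qty` columns; it prints the five-number summary and IQR per SKU and then draws the three boxes side by side. Matplotlib’s default whiskers extend to the last data point within 1.5 × IQR of the box.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
sales = pd.DataFrame({
    "sku": np.repeat(["A", "B", "C"], 200),
    "qty": np.concatenate([rng.poisson(lam, 200) for lam in (20, 35, 50)]),
})

# Five-number summary per SKU, plus the interquartile range.
summary = sales.groupby("sku")["qty"].quantile([0, 0.25, 0.5, 0.75, 1]).unstack()
summary.columns = ["min", "q1", "median", "q3", "max"]
summary["iqr"] = summary["q3"] - summary["q1"]
print(summary)

groups = [g["qty"].to_numpy() for _, g in sales.groupby("sku")]  # ordered A, B, C

fig, ax = plt.subplots()
result = ax.boxplot(groups)  # one box per SKU
ax.set_xticklabels(list(summary.index))
ax.set_xlabel("SKU")
ax.set_ylabel("Quantity sold per transaction")
```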

8. Line Charts

A line chart illustrates trends and changes in a variable over time. It connects data points with a line, with the x-axis representing time and the y-axis representing the variable’s values.

Line Chart (Image by Author)
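A line chart of monthly totals could be sketched as below. The data is simulated, the column names (`date`, `sku`, `qty`) are assumptions, and aggregating to months is just one reasonable choice; daily or weekly totals work the same way.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
dates = pd.date_range("2022-01-01", "2022-12-31", freq="D")
sales = pd.concat(
    [
        pd.DataFrame({"date": dates, "sku": sku, "qty": rng.poisson(lam, len(dates))})
        for sku, lam in {"A": 20, "B": 35, "C": 50}.items()
    ],
    ignore_index=True,
)

# Total quantity per calendar month and SKU: rows = month, columns = SKU.
month = sales["date"].dt.to_period("M")
monthly = sales.groupby([month, "sku"])["qty"].sum().unstack("sku")

fig, ax = plt.subplots()
for sku in monthly.columns:
    ax.plot(monthly.index.to_timestamp(), monthly[sku], marker="o", label=f"SKU {sku}")
ax.set_xlabel("Month")
ax.set_ylabel("Total quantity sold")
ax.legend()
print(monthly)
```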

In today’s episode, we’ve discovered the power of descriptive analytics and the revealing nature of data visualization techniques. These tools have allowed us to identify historical patterns and trends in Zara’s coffee shop sales, providing her with a solid foundation for strategic decision-making. All of the code and raw files are available in my GitHub repository [Link].

However, our journey into the realm of data science is just beginning. In our next episode, we’ll delve into inferential statistics, aiming to understand why these patterns occur and unearthing the hidden relationships in Zara’s data. See you next time on our Data Mastery Series!

Thank you for taking the time to read this article!

Please feel free to contact me; I am happy to share and exchange ideas on topics related to Data Science and Supply Chain.
Facebook: facebook.com/nattapong.thanngam
LinkedIn: linkedin.com/in/nattapong-thanngam


Data Science Team Lead at Data Cafe, Project Manager (PMP #3563199), Black Belt-Lean Six Sigma certificate