[Hyperquery Tutorial] — Data Visualization

Paolo Perrone
6 min read · Sep 11, 2023


This tutorial explores the wide range of visualization options available on Hyperquery.

Before diving in, take a quick look at the documentation to get familiar with the visualization environment. This will give you a solid foundation for the concepts we’ll be exploring.

Hyperquery offers six chart types: bars, lines, areas, scatterplots, heatmaps, and arcs (pies), plus some sick customization options powered by Vega-Lite.

In data visualization, each chart type serves a distinct purpose:

  • Bars, scatterplots, and heatmaps shine for visualizing relationships between variables.
  • Lines and areas are recommended for displaying trends over time.
  • Pies excel at representing proportions within a whole.

In what follows, we’ll uncover the distinctive properties of each chart type and how to utilize them effectively.

If you’re looking for a more conceptual approach, check out How to Chart? where I share some of the most powerful data visualization frameworks and mental models I’ve picked up over the years.

The SQL code is included alongside each chart, so you can recreate and tweak all the visuals to your liking. Just follow along and duplicate it to your Hyperquery environment.

Now, without further ado, let’s dive in!

Bar Charts

visualize relationships

Bar charts use rectangular bars to visually represent categorical variables.

The length of each bar is directly proportional to the magnitude of the value it represents.

This makes bar charts ideal for:

  1. Tracking changes over time
  2. Showing rankings or top values
  3. Visualizing the frequency of count data
  4. Comparing subcategories within a category

1. Track changes over time

Bar charts are a great option when you need to track changes within a specific category over time.

For example, you can create a bar chart to track the sales of different products, the revenue generated by different departments, or the popularity of different music genres over different periods.
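As a rough illustration of this first use case, the sketch below builds a plain Vega-Lite bar spec as a Python dict. The field names (`year`, `product`, `sales`) and the sample rows are made up for illustration, and Hyperquery's own chart settings may map onto Vega-Lite differently.

```python
import json

# Made-up sample rows (illustrative only, not real sales data).
rows = [
    {"year": 2021, "product": "Vinyl", "sales": 10},
    {"year": 2021, "product": "CD", "sales": 40},
    {"year": 2022, "product": "Vinyl", "sales": 15},
    {"year": 2022, "product": "CD", "sales": 30},
]

# Time period on x, summed sales on y, one color per product.
bar_over_time = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    "mark": "bar",
    "encoding": {
        "x": {"field": "year", "type": "ordinal"},
        "y": {"aggregate": "sum", "field": "sales", "type": "quantitative"},
        "color": {"field": "product", "type": "nominal"},
    },
}

print(json.dumps(bar_over_time, indent=2))
```

The printed JSON can be pasted into any Vega-Lite editor to see the result.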

2. Show rankings or top values

Horizontal bar charts are incredibly useful to highlight the top performers across different categories.

Think about best-selling products by area, most viewed videos by channel, or highest-rated movies by genre.

Each bar’s length represents the item’s value, so sorting the bars by length makes the ranking easy to read and the top values easy to spot.
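Here is a minimal sketch of a ranking chart, assuming a made-up set of view counts. It keeps the top three items in Python, then asks Vega-Lite to draw horizontal bars sorted by descending value. The names `views`, `top3`, and `ranking_spec` are hypothetical.

```python
# Made-up view counts per video (illustrative only).
views = {"Video A": 120, "Video B": 340, "Video C": 90, "Video D": 210}

# Keep the top 3 by value; the ranking comes from this ordering.
top3 = sorted(views.items(), key=lambda kv: kv[1], reverse=True)[:3]

ranking_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [{"video": name, "views": n} for name, n in top3]},
    "mark": "bar",
    "encoding": {
        # Putting the category on y makes the bars horizontal;
        # "-x" sorts the categories by descending x value.
        "y": {"field": "video", "type": "nominal", "sort": "-x"},
        "x": {"field": "views", "type": "quantitative"},
    },
}

print([name for name, _ in top3])
```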

3. Visualize frequency or count data

Bar charts are also useful for presenting frequency or count data, like the number of songs by artist, the occurrence of customer complaints by type, or the distribution of car colors.
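For count data, the chart can do the counting itself. The sketch below, using made-up complaint records, shows Vega-Lite's `count` aggregate next to a plain Python `Counter` as a sanity check. The field name `complaint_type` is an assumption.

```python
from collections import Counter

# Made-up complaint records, one entry per complaint (illustrative only).
complaints = ["billing", "shipping", "billing", "quality", "billing", "shipping"]

# What the chart should show: one bar per type, height = number of records.
counts = Counter(complaints)

count_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [{"complaint_type": c} for c in complaints]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "complaint_type", "type": "nominal"},
        # "count" needs no field: it counts the rows in each x group.
        "y": {"aggregate": "count", "type": "quantitative"},
    },
}

print(dict(counts))
```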

4. Compare subcategories within a category

Bar charts can also display the contribution of subcategories within a broader category, like the sales performance of different product models or the revenue distribution across different periods.

The same chart as the previous one, but with an unstacked x-axis.

This chart delves deeper, displaying total sales distribution over time and within each category.
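How Hyperquery implements the stacked and unstacked views is not shown here, so treat the following as one possible Vega-Lite reading of that toggle. With a color field, bars stack by default; adding an `xOffset` channel (available in recent Vega-Lite 5 releases) places the subcategories side by side instead. The helper `subcategory_bars` and the sample rows are hypothetical.

```python
# Made-up sample rows (illustrative only).
rows = [
    {"year": 2021, "format": "physical", "sales": 30},
    {"year": 2021, "format": "digital", "sales": 50},
    {"year": 2022, "format": "physical", "sales": 20},
    {"year": 2022, "format": "digital", "sales": 70},
]


def subcategory_bars(stacked: bool) -> dict:
    """Return a bar spec with subcategories stacked or placed side by side."""
    encoding = {
        "x": {"field": "year", "type": "ordinal"},
        "y": {"aggregate": "sum", "field": "sales", "type": "quantitative"},
        "color": {"field": "format", "type": "nominal"},
    }
    if not stacked:
        # Offset each subcategory within its x band (grouped bars).
        encoding["xOffset"] = {"field": "format", "type": "nominal"}
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": encoding,
    }


stacked_spec = subcategory_bars(stacked=True)
grouped_spec = subcategory_bars(stacked=False)
```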

Line Charts

display trends

Line charts are the go-to choice for:

  • tracking trends over time
  • monitoring performance metrics
  • analyzing seasonality and cyclical patterns

For example:

To analyze how the relationship between variables changed over time, you can layer multiple line charts together, like in the example below.
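Layering several lines usually means one row per (time, series) pair. The sketch below assumes a made-up "wide" table with one column per format, reshapes it to long form, and colors one line per format. All column names and values are illustrative.

```python
# Made-up wide table: one column per sales format (illustrative only).
wide = [
    {"year": 1999, "physical": 90, "download": 0, "streaming": 0},
    {"year": 2010, "physical": 40, "download": 30, "streaming": 5},
]
formats = ["physical", "download", "streaming"]

# Reshape to long form: one row per (year, format) pair.
long_rows = [
    {"year": row["year"], "format": fmt, "sales": row[fmt]}
    for row in wide
    for fmt in formats
]

line_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": long_rows},
    "mark": "line",
    "encoding": {
        "x": {"field": "year", "type": "ordinal"},
        "y": {"field": "sales", "type": "quantitative"},
        # One line (and one color) per format.
        "color": {"field": "format", "type": "nominal"},
    },
}

print(len(long_rows), "rows")
```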

In the next chart, we combine sales data for physical, digital download, and streaming formats.

Then, we use the “group by — size” option to group the sales data based on their respective sizes.

This grouping creates a visual representation where the thickness of the lines corresponds to sales performance.

The thicker blue line in 1999 showcases the peak of physical sales, while the barely visible purple line during the early 2000s represents the nascent stage of digital streaming.
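Whether Hyperquery's size grouping maps to a particular Vega-Lite mark is not stated, so this is only an approximation of the effect described above. In plain Vega-Lite, the `trail` mark lets line thickness vary along the series through the `size` channel. The data values below are made up.

```python
# Made-up yearly sales per format (illustrative only).
rows = [
    {"date": "1999-01-01", "format": "physical", "sales": 90},
    {"date": "2005-01-01", "format": "physical", "sales": 60},
    {"date": "2005-01-01", "format": "streaming", "sales": 1},
    {"date": "2015-01-01", "format": "physical", "sales": 20},
    {"date": "2015-01-01", "format": "streaming", "sales": 45},
]

trail_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    # "trail" draws a line whose width can change from point to point.
    "mark": "trail",
    "encoding": {
        "x": {"field": "date", "type": "temporal"},
        "y": {"field": "sales", "type": "quantitative"},
        # Thickness follows the sales value at each point.
        "size": {"field": "sales", "type": "quantitative"},
        "color": {"field": "format", "type": "nominal"},
    },
}
```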

Scatterplots

visualize relationships

Scatterplots are handy for visualizing the relationship between two numerical variables.

In a scatterplot, each point represents a single observation, and its position on the x- and y-axes reflects the values of the two variables being compared.

These charts quickly reveal correlations, clusters, and outliers.

This plot displays the relationship between the duration of songs in minutes and the number of likes in millions.

The chart is grouped by volume of views, with larger data points indicating songs with more views.

This chart visualizes the relationship between Energy and musical Key for songs with more than 10 million views.

Data point size corresponds to the total views of each song, while color indicates the Tempo (beats per minute) of each song.
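A scatterplot like the one described can be sketched with four encoding channels: position on x and y, plus size and color. The song rows below are made up, and the field names (`energy`, `key`, `tempo`, `views`) are assumptions about how such a dataset might be laid out.

```python
# Made-up song rows (illustrative only).
songs = [
    {"title": "Song A", "energy": 0.8, "key": 5, "tempo": 120, "views": 25_000_000},
    {"title": "Song B", "energy": 0.4, "key": 2, "tempo": 95, "views": 3_000_000},
    {"title": "Song C", "energy": 0.6, "key": 9, "tempo": 140, "views": 12_000_000},
]
MIN_VIEWS = 10_000_000

scatter_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": songs},
    # Keep only songs above the view threshold.
    "transform": [{"filter": f"datum.views > {MIN_VIEWS}"}],
    "mark": "point",
    "encoding": {
        "x": {"field": "energy", "type": "quantitative"},
        "y": {"field": "key", "type": "ordinal"},
        "size": {"field": "views", "type": "quantitative"},
        "color": {"field": "tempo", "type": "quantitative"},
    },
}

# The same filter in plain Python, to check which songs survive it.
kept = [s["title"] for s in songs if s["views"] > MIN_VIEWS]
print(kept)
```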

Areas

display trends

Similar to line charts, area charts are great for visualizing trends over time.

Unlike lines, area charts convey the magnitude of variables by filling the space between the line and the x-axis.

Stacked area charts excel at emphasizing cumulative compositions over time, making them ideal for part-to-whole analysis.
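For part-to-whole analysis, one option in plain Vega-Lite is to normalize the stack so each period sums to 100%. The sketch below uses made-up numbers and also computes the shares by hand, to show what the normalized stack represents.

```python
# Made-up yearly sales per format (illustrative only).
rows = [
    {"year": 2005, "format": "physical", "sales": 60},
    {"year": 2005, "format": "digital", "sales": 40},
    {"year": 2015, "format": "physical", "sales": 20},
    {"year": 2015, "format": "digital", "sales": 80},
]

stacked_area_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    "mark": "area",
    "encoding": {
        "x": {"field": "year", "type": "ordinal"},
        "y": {
            "aggregate": "sum",
            "field": "sales",
            "type": "quantitative",
            # "normalize" rescales each year's stack to a 0-1 range.
            "stack": "normalize",
        },
        "color": {"field": "format", "type": "nominal"},
    },
}

# The same shares computed by hand: each format's part of its year's total.
totals = {}
for r in rows:
    totals[r["year"]] = totals.get(r["year"], 0) + r["sales"]
shares = {(r["year"], r["format"]): r["sales"] / totals[r["year"]] for r in rows}
print(shares)
```

Dropping the `stack` entry falls back to Vega-Lite's default stacking, which shows absolute totals instead of shares.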

Area charts can also effectively display cumulative values over time.
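A cumulative series is just a running total. The sketch below computes one in Python with `itertools.accumulate` and, as an alternative, expresses the same idea with a Vega-Lite window transform. The yearly figures and the output field name `cumulative_sales` are made up.

```python
from itertools import accumulate

# Made-up yearly sales (illustrative only).
yearly = [("2019", 10), ("2020", 15), ("2021", 5), ("2022", 20)]

# Running total computed in Python.
running = list(accumulate(value for _, value in yearly))

cumulative_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [{"year": y, "sales": v} for y, v in yearly]},
    "transform": [
        {
            # Sum sales from the first row up to the current row.
            "sort": [{"field": "year"}],
            "window": [{"op": "sum", "field": "sales", "as": "cumulative_sales"}],
            "frame": [None, 0],
        }
    ],
    "mark": "area",
    "encoding": {
        "x": {"field": "year", "type": "ordinal"},
        "y": {"field": "cumulative_sales", "type": "quantitative"},
    },
}

print(running)
```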

Pie (Arc)

representing proportions

Pie charts compare proportions within a whole, like the allocation of budget by different departments.

In a pie chart, each category is depicted as a slice of the circle, and the size of each slice corresponds to its proportion of the total.
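To make the slice-to-proportion link concrete, here is a small sketch with a made-up department budget. It computes each share by hand and builds a Vega-Lite `arc` spec, where the `theta` channel sets the slice angle. Department names and amounts are illustrative.

```python
# Made-up budget per department (illustrative only).
budget = {"Engineering": 50, "Marketing": 30, "Operations": 20}

# Each slice's share of the whole.
total = sum(budget.values())
shares = {dept: amount / total for dept, amount in budget.items()}

pie_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [{"department": d, "amount": a} for d, a in budget.items()]
    },
    "mark": "arc",
    "encoding": {
        # theta maps the value to the slice angle; color separates categories.
        "theta": {"field": "amount", "type": "quantitative"},
        "color": {"field": "department", "type": "nominal"},
    },
}

print(shares)
```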


Heatmaps

visualize relationships

Heatmaps use color gradients to depict the distribution of a variable across two dimensions.

Each cell in the heatmap corresponds to a range of values for the two dimensions, with the color’s intensity reflecting the magnitude of the variable in that cell.

This makes heatmaps ideal for visualizing patterns, correlations, and distributions within large datasets.

In the next heatmap, we visualize the relationship between danceability, energy, and the total number of views.

Each cell represents the total views for a particular combination of danceability and energy ranges.

Darker shades indicate higher total views, while lighter shades indicate lower total views.
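The binning behind a heatmap like this can be sketched in a few lines. Assuming made-up songs with `danceability` and `energy` scores between 0 and 1, the code below sums views per cell by hand and builds a Vega-Lite `rect` spec that bins both axes. The bin count of 10 is an arbitrary choice for illustration.

```python
from collections import defaultdict

# Made-up songs with scores in [0, 1] (illustrative only).
songs = [
    {"danceability": 0.15, "energy": 0.85, "views": 100},
    {"danceability": 0.12, "energy": 0.81, "views": 50},
    {"danceability": 0.75, "energy": 0.35, "views": 30},
]
N_BINS = 10


def bin_index(value: float) -> int:
    """Map a score in [0, 1] to one of N_BINS equal-width bins."""
    return min(int(value * N_BINS), N_BINS - 1)


# Total views per (danceability bin, energy bin) cell.
cell_totals = defaultdict(int)
for s in songs:
    cell = (bin_index(s["danceability"]), bin_index(s["energy"]))
    cell_totals[cell] += s["views"]

heatmap_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": songs},
    "mark": "rect",
    "encoding": {
        "x": {"field": "danceability", "bin": {"maxbins": N_BINS}, "type": "quantitative"},
        "y": {"field": "energy", "bin": {"maxbins": N_BINS}, "type": "quantitative"},
        # Color intensity encodes the summed views in each cell.
        "color": {"aggregate": "sum", "field": "views", "type": "quantitative"},
    },
}

print(dict(cell_totals))
```

Replacing the color encoding with `{"aggregate": "count", "type": "quantitative"}` would shade each cell by the number of songs rather than by total views.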

This chart visualizes the relationship between tempo, energy, and the cumulative count of songs.

Each square represents a combination of tempo and energy ranges, and its shading indicates the cumulative count.

Lastly, consider this scatterplot variation where the size of each dot corresponds to the song count for each combination of likes and musical Key.

And there you have it — a whirlwind tour of the amazing data visualization options in Hyperquery!

I hope this guide sparked some ideas for tailoring charts to reveal insights.

Now, I’m eager to hear from you:

  • What was your biggest “aha” moment from this post?
  • Did a particular chart type suddenly make sense in a new way?
  • Are there any other chart types that still baffle you or specific scenarios you’d like to explore in future posts?

The journey to becoming a master data storyteller is an ongoing adventure!

Thanks for following along!

Onwards! 🚀


Written by Paolo Perrone

Making the most of my passion for data and writing 🤖✒️ 20k+ followers on Linkedin https://www.linkedin.com/in/paoloperrone/