Design, Deception, and Iteration of Visualizing Emission Trends

Feb 17, 2024

By: Lyla Fruehstorfer, Lilly Nguyen, Michael Drozd, David Lapaglia

In the realm of environmental science and policy-making, the effective communication of greenhouse gas (GHG) emissions data is essential for making informed decisions and generating public awareness. Over the past weeks, our group explored a dataset of GHG emissions over time and across various countries.

Our goal was to create clear, accurate, and informative visualizations of pollution data over time. To achieve this, we focused on design objectives like ensuring easy comparison of emission levels, highlighting key trends across countries and time periods, and using interactive elements to allow viewers to explore the data themselves. We researched design principles to guide our design decisions, aiming to create two informative visualizations and two deceptive ones.

In this design document, we outline our approach to cleaning, analyzing, and visualizing this data.

Ideation

After pulling the data into our Google Colab environment, we each took some time to sketch out visualizations that we thought would be compelling and visually appealing. We drew rough sketches of bubble graphs, streamgraphs, area charts, and more. This gave us the opportunity to dive deeper into the available data and see what sorts of relationships we could highlight in a visualization.

Initial sketches exploring different relationships between pollutant levels over time

In one of our in-class meetings, we shared our sketches and also began to discuss what methods for deception we could incorporate into our graphs. Here are some of the ideas we had:

Ideas for manipulating the data:

Incorporate dual y-axes to display the top and bottom 3–5 countries in pollutants
Utilize the SVG to manipulate the starting point of the y-axis
Reorder dates to disrupt the sequential flow of a line graph or bar chart
Make use of color bias to evoke a sense of gloom or negativity when presenting the data

Decisions after reviewing sketches:

Have one or multiple visualizations displaying country and pollutant data
Show how each pollutant amount has changed over time
Use color to show different categorical data such as countries or pollutant type
Aim for informative visualizations and then find ways to manipulate them later

Prototyping

During the prototyping phase of our design process, we selected a few key visualizations that we believed would effectively convey interesting trends and patterns in the data. Our focus centered on highlighting trends such as the rising GHG emissions over time and identifying the major contributors to these pollutants. Through an iterative process, we experimented with potential color schemes and explored different markings/encodings including bars, circles, and lines.

Refined sketches with various markings (bars, lines, circles) and channels (hue, area, position)

The next step was to practice creating visualizations with Altair. At this stage, our goal was to create as many charts as possible, valuing quantity over quality. As novice Altair programmers, we collectively ran into some pesky challenges with both the dataset and Altair.

Issues with Altair:

Altair's size limit, which by default raises an error for datasets with more than 5,000 rows
Debugging errors with data imports
Too much data to parse
Unexplained negative emission levels
Cluttered and unsorted visualizations

Initial visualization with Altair without any data cleaning/filtering

Visually, there were too many countries to show in one graph. The pollutant levels for each country also varied a lot in magnitude. Through this process, we even found that some countries had negative pollutant values.

Solutions:

Spend time cleaning the data by deleting unnecessary attributes
Create multiple data frames with filtered data (e.g., a CO2-only data frame)
Aggregate the different pollutant types into sums across years
Specify data types
Use StackOverflow, ChatGPT, and Altair documentation to debug
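
The cleaning steps above might look roughly like this in pandas. The column names (`country`, `year`, `pollutant`, `value`, `notes`) and the sample rows are assumptions for illustration; the real dataset's schema may differ.

```python
import pandas as pd

# Hypothetical sample rows; values arrive as strings, as they might from a CSV.
raw = pd.DataFrame({
    "country": ["A", "A", "B", "B", "B"],
    "year": ["1990", "1990", "1990", "1991", "1991"],
    "pollutant": ["CO2", "CH4", "CO2", "CO2", "CH4"],
    "value": ["10.5", "2.0", "-3.0", "7.5", "1.0"],
    "notes": ["", "", "", "", ""],  # stands in for an unneeded attribute
})

# 1. Drop attributes the charts do not need.
df = raw.drop(columns=["notes"])

# 2. Specify data types explicitly.
df = df.astype({"year": "int64", "value": "float64"})

# 3. Flag negative values for review instead of silently dropping them.
negatives = df[df["value"] < 0]

# 4. Build a filtered frame per pollutant (here, CO2 only).
co2 = df[df["pollutant"] == "CO2"]

# 5. Aggregate each pollutant into a total per year.
totals = df.groupby(["year", "pollutant"], as_index=False)["value"].sum()
```

Keeping the negative rows in a separate frame makes it possible to decide later whether they are errors or legitimate values.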

User Testing

We tested our prototypes with real people and iterated on our designs based on their feedback. The written and visual evidence of that testing is below.

The streamgraph had potential as a visualization. After working with the data, we pivoted to a more filtered dataset and focused on showing the different types of pollutants rather than individual countries, which made for a much better graph.
Evan: “Visualization takes a lot of time to understand and digest. Really cool to look at but weird for comparisons. Think about it because not all the graphs need all the colors; mess with it in PowerPoint. Visually compelling but is it effective? The way that the bumps on both sides are not symmetrical presents miscommunication.”

We wanted to make the chart below more user-friendly. After Anne looked at it, she felt that it was too hard to tell which country was in “second place.”
From there, we decided to split the chart into two smaller charts. Originally we split it alphabetically (A-M and L-Z), but Evan suggested that sorting by values would be more effective than alphabetical order. Anne also liked the descending order, which made it easier to see which countries were in second place and onward.
Evan: “Instead of splitting alphabetically, consider sorting by values. The comparison of the years is effective, but try playing with the ratios to enhance clarity. To facilitate comparison, aim for a more compact layout. Even if it appears squeezed, it will still provide a sense of comparison. If it becomes overwhelming, focus on a few key countries.”

Progression of chart adjustments following User Testing

Final Informative and Manipulated Visualizations

Informative line graph depicting pollutants over time

Modified the marking to be curved lines instead of rigid straight lines to showcase trends over time
Separated pollutants into different axes to avoid over-cluttering
Used hue as an identity channel to represent different pollutant categories

Informative bar chart comparing emissions from top countries in 1990 vs 2020

Sorted by emission values instead of alphabetically by country, based on Evan’s feedback
Emphasized select key countries to prevent overcrowding of data
Implemented a more condensed layout to emphasize critical comparisons
Facilitated side-by-side comparison of graphs to enhance visual clarity

Manipulated Data with a logarithmic scale

The goal of this dual y-axis line chart was to show how differing value scales can confuse readers and make data easy to manipulate.
When it is unclear which line belongs to which y-axis, the chart becomes muddied, and some readers may conclude that carbon dioxide drops from 30 million to 2 million.
A logarithmic scale on the right-hand axis distorts the lower-magnitude pollutant data so that pollutant levels seem to rise faster than they actually do.
Based on feedback, we changed the title of the logarithmic axis from “Value” to “Log Value” to avoid overt lying, which is more malicious than deception.

Manipulated streamgraph of pollutants over time

The y-axis scales vary across these charts, making it harder for viewers to compare values accurately.
Some pollutants’ shapes are manipulated to mirror those of previously displayed pollutants (e.g., Hydrofluorocarbons atop Methane).
Flipping half the graph horizontally misrepresents the data, causing positive values to appear as negative spikes.

Reflection

Feedback from classmates on our final visualizations gave us insight into both the strengths and weaknesses of our designs. It points to corrections we can make and design choices we can refine, and it will improve the quality of our visualizations in the future.

Strengths

Tooltips on each chart allow users to quickly understand the breakdown of the data and provide an interactive experience.
More complicated visualizations are separated into three different charts to improve user experience.

Weaknesses

Our manipulative charts are difficult to read, reducing the effectiveness of manipulation.
Logarithmic lines are not clearly labeled, so the data is difficult to interpret at some points.

Improvements

Adding percentages to our country pollutant bar chart would improve readability, allowing users to quickly see each pollutant’s share of the total.
Avoid placing logarithmic and non-logarithmic lines on the same chart.
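
The percentage breakdown mentioned in the first improvement could be computed in pandas before charting. This is a minimal sketch with invented values and placeholder column names (`country`, `pollutant`, `value`).

```python
import pandas as pd

# Invented values for two hypothetical countries.
data = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "pollutant": ["CO2", "CH4", "N2O", "CO2", "CH4"],
    "value": [70.0, 20.0, 10.0, 45.0, 5.0],
})

# Share of each pollutant within its country's total, as a percentage.
country_total = data.groupby("country")["value"].transform("sum")
data["percent"] = (data["value"] / country_total * 100).round(1)
```

The resulting `percent` column could then be added to a chart's tooltip or drawn as a text label on each bar.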