Visualizing electricity grid emissions data in Python

Published in Singularity, Feb 7, 2024

At Singularity, we develop tools to measure and analyze emission rates across the electrical grid. The amount of input data can be very large (the Eastern Interconnect, the transmission system that delivers electricity to customers from Maine to Florida to the Dakotas, has more than 50,000 connection points) and multidimensional, varying in time and space.

Over time we’ve developed a set of visualizations that help us quickly understand what’s happening in systems like this. They don’t tell us everything, but they give us a big-picture view and help us identify the places we need to explore further.

The tools

Before we can start visualizing electrical grid data, we need the right tools to structure it.

To work with electrical grid data, we use PyPSA, an open-source power systems analysis library. PyPSA networks use Pandas dataframes to structure the data for each electrical grid component, which allows us to easily combine grid data from a PyPSA network with emissions data from other sources.

To format emissions data in this blog post, we’ll use Pandas. In reality, we’re often working with multiple types of data on the grid simultaneously, including fuel mix, CO2, and CO2 equivalent (a measure of the total warming potential of emitted CO2, CH4, and N2O). For that we use xarray, a library for labeled multidimensional arrays that borrows much of its indexing model from Pandas.

Finally, we’ll need visualization tools. We use PyPSA’s matplotlib-based network visualization functions to visualize spatial patterns on the transmission grid topology. For our other plot styles (like the choropleths and heatmaps we show in this blog post), we use Plotly.

In this blog post, we’ll be visualizing a model of the 2020 US grid from Breakthrough Energy. You can read more about their grid modeling work here.

The visualizations

1. Consumed emission rates on the transmission grid

We start by visualizing consumed emission rates, that is, the emission rate associated with the electricity that a load, like a lightbulb or a data center, is using. We have a PyPSA network net and consumed_emissions data in units of lbs CO2e/MWh with a time index and bus names as columns:
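The original data isn’t reproduced here, so the following is only a minimal sketch of the layout described above, built from synthetic values. The bus names, date range, and value range are placeholders we made up for illustration, not real grid data.

```python
import numpy as np
import pandas as pd

# Hypothetical bus names; in practice these would come from net.buses.index
bus_names = ["bus_1", "bus_2", "bus_3"]

# Hourly timestamps (UTC, timezone-naive) for a single day
times = pd.date_range("2020-08-01", periods=24, freq="h", name="time")

# Synthetic consumed emission rates in lbs CO2e/MWh:
# one row per timestamp, one column per bus
rng = np.random.default_rng(0)
consumed_emissions = pd.DataFrame(
    rng.uniform(0, 2000, size=(len(times), len(bus_names))),
    index=times,
    columns=bus_names,
)

# Emission rates for every bus at the first timestamp
first_snapshot = consumed_emissions.loc[consumed_emissions.index[0], bus_names]
```

Selecting a single row this way yields one value per bus, which is the shape a per-bus color argument needs.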

We define a custom colormap that runs from light green to dark red, so that darker, redder shades communicate increasingly emissions-intensive electricity:

from matplotlib.colors import LinearSegmentedColormap
matplotlib_cmap = LinearSegmentedColormap.from_list(
   "carbon",
   ["#c2ffcd","#cdf0a5","#d8e07c","#e3d054","#edc02b","#c89421","#a26716","#7d3b0b","#570e00"],
)
matplotlib_cmap.set_bad("lightgrey")
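To sanity-check a colormap like this one, it can help to map a few emission rates to colors directly. The sketch below rebuilds the same colormap and uses a hypothetical helper, rate_to_hex, which is our own name and not part of matplotlib. It assumes a color scale running from 0 to 2000 lbs CO2e/MWh.

```python
from matplotlib.colors import LinearSegmentedColormap, Normalize, to_hex

carbon_colors = [
    "#c2ffcd", "#cdf0a5", "#d8e07c", "#e3d054", "#edc02b",
    "#c89421", "#a26716", "#7d3b0b", "#570e00",
]
cmap = LinearSegmentedColormap.from_list("carbon", carbon_colors)

# Assumed scale: 0 to 2000 lbs CO2e/MWh maps onto the full colormap
norm = Normalize(vmin=0, vmax=2000)


def rate_to_hex(rate):
    """Return the hex color the colormap assigns to an emission rate."""
    return to_hex(cmap(norm(rate)))


low_color = rate_to_hex(0)       # lightest green end of the scale
mid_color = rate_to_hex(1000)    # somewhere in the yellow/orange range
high_color = rate_to_hex(2000)   # darkest red end of the scale
```

Checking the two endpoints is a quick way to confirm that the list of colors is in the intended order.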

Finally, we use PyPSA’s built-in plot function to visualize the carbon intensity of each transmission system bus at a single timestamp, adjusting line_widths and bus_sizes as needed depending on the size of the network:

import matplotlib.colors

this_time = consumed_emissions.index[0]
max_color = 2000  # upper end of the color scale, in lbs CO2e/MWh

net.plot(
    bus_cmap=matplotlib_cmap,
    bus_colors=consumed_emissions.loc[this_time, net.buses.index].values,
    bus_norm=matplotlib.colors.Normalize(vmin=0, vmax=max_color, clip=False),
    line_colors="lightgrey",
    line_widths=0.5,
    bus_sizes=0.005,
    # %b (abbreviated month name) is portable; %h is not supported on every platform
    title=this_time.strftime("%b %d %H:00 UTC"),
)

This gives us an image like this, showing each bus on the transmission grid:

If we’re interested in more than one timestamp, we can save images for multiple snapshots and animate them using the imageio library:

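import imageio.v2 as imageio  # the v2 API, available in imageio 2.16 and later

# image_file_names: paths of the saved snapshot images, in time order
images = [imageio.imread(f) for f in image_file_names]
imageio.mimsave("eastern_movie.gif", images, loop=0)  # loop=0 repeats the animation indefinitely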

This shows us that during this day, there are high emission rates in the Midwest and lower emission rates in eastern and western regions of the grid. But there’s so much data here that it’s difficult to interpret overall trends, so our next step is to aggregate these results to policy-relevant spatial scales.

2. Summarizing spatial trends

Spatial averages of consumed emission rates can be a useful way to summarize trends. In the electricity world, there are spatial divisions with practical meaning: states have emission targets and regulations, and Balancing Authorities (BAs) control generation within their regions. However, spatial averages should be interpreted with caution. An average over a large region, like a state, will usually conceal more granular trends that would appear if you averaged over smaller regions, like counties.

To look at a month-long, BA-level average of the data we looked at above, we first calculate BA-wide consumed emission rates:
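The calculation itself isn’t shown here. As a rough sketch of one way to get a table in the shape the merge step expects (BA names as the index and a mean column), the example below averages synthetic bus-level rates over time and then across the buses in each BA. The bus_to_ba mapping, the bus and BA names, and the choice of a simple unweighted mean are all assumptions for illustration; a load-weighted average may be more appropriate in practice.

```python
import pandas as pd

# Synthetic consumed emission rates (lbs CO2e/MWh): rows are hours, columns are buses
times = pd.date_range("2020-08-01", periods=3, freq="h")
consumed_emissions = pd.DataFrame(
    {
        "bus_1": [900.0, 1000.0, 1100.0],
        "bus_2": [700.0, 800.0, 900.0],
        "bus_3": [300.0, 400.0, 500.0],
    },
    index=times,
)

# Hypothetical lookup from bus name to the BA that contains it
bus_to_ba = pd.Series({"bus_1": "BA_A", "bus_2": "BA_A", "bus_3": "BA_B"})

# Average over time for each bus, then across the buses in each BA
bus_means = consumed_emissions.mean(axis=0)
ba_emission_rates = bus_means.groupby(bus_to_ba).mean().to_frame("mean")
```

Because this sketch treats every bus equally, a BA with many small buses and one large load center would be dominated by the small buses. This is the kind of detail worth checking before reading too much into an average.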

Next, we use geopandas to read in a shapefile describing BA boundaries and merge in the emission rate data:

import geopandas as gpd

bounds = gpd.read_file("data/gis/Control__Areas/Control__Areas.shp")

# Attach the BA-level emission rates to the matching BA shapes
bounds = bounds.merge(
    ba_emission_rates,
    how="left",
    left_on="NAME",
    right_index=True,
)

# Remove any BAs outside of the Eastern Interconnect where we don't have data
bounds = bounds.dropna(subset=["mean"])

bounds = bounds.set_index("NAME")

# Convert to lat/long coords (WGS84) for plotting
bounds = bounds.to_crs("EPSG:4326")

Finally, we can use Plotly’s built-in choropleth function to color each shape by its mean emission rate. Plotly can take a custom colorscale from a list of colors, and we use the same custom colorscale we used in the transmission system visualization. We add some custom styling to adjust the projection and add state and country boundaries, which make the map more interpretable.

import plotly.express as px

# Same hex colors as the matplotlib colormap above
carbon_colors = ["#c2ffcd", "#cdf0a5", "#d8e07c", "#e3d054", "#edc02b", "#c89421", "#a26716", "#7d3b0b", "#570e00"]

fig = px.choropleth(
    bounds,
    geojson=bounds.geometry,
    locations=bounds.index,
    color="mean",
    labels={"mean": "Consumed<br>emission rate"},
    range_color=(0, 2000),
    color_continuous_scale=carbon_colors,
    width=700,
    height=450,
    title="August 2020",
)

# Style the base map and draw state and country boundaries
fig.update_layout(
    geo=dict(
        scope="north america",
        showland=True,
        landcolor="rgb(250, 250, 250)",
        subunitcolor="rgb(217, 217, 217)",
        countrycolor="rgb(217, 217, 217)",
        countrywidth=0.5,
        subunitwidth=0.5,
        resolution=50,
        showsubunits=True,
        projection_scale=2,
    )
)

# Zoom the map to the plotted BA shapes
fig.update_geos(fitbounds="locations")

The resulting image, below, shows that emission rate averages vary much less than the nodal rates we visualized above (the colorscales are the same in both). We can also see some trends in the averages that were less obvious in the highly granular data: the Midwest consumes, on average, higher-emission power, while the Northeast consumes relatively low-emission power. There’s also a pocket of particularly high-emission power in South Carolina, which might be worth further exploration.

Average consumed emission rate for each Eastern Interconnect balancing authority in August of a modeled year.

3. Visualizing temporal trends

So far, we’ve seen a granular visualization of emission rates at each transmission system bus at each hour, and a choropleth that captures average spatial trends. But what about easily viewing trends in time? For this, we use day-hour heatmaps.

First, we calculate consumed emission rates at some spatial scale (here, across New York):
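This calculation isn’t shown either. One plausible approach, sketched below with synthetic data, is a load-weighted average across the buses in the region for each hour. The bus names, the load_mw table, and the weighting scheme are assumptions for illustration. The result has a time column (UTC, timezone-naive) and a rate column, which are the column names the reshaping step relies on.

```python
import pandas as pd

times = pd.date_range("2020-01-01", periods=2, freq="h")

# Synthetic consumed emission rates (lbs CO2e/MWh) at two hypothetical New York buses
consumed_emissions = pd.DataFrame(
    {"ny_bus_1": [400.0, 600.0], "ny_bus_2": [800.0, 200.0]},
    index=times,
)

# Synthetic load (MW) at the same buses and hours
load_mw = pd.DataFrame(
    {"ny_bus_1": [30.0, 10.0], "ny_bus_2": [10.0, 30.0]},
    index=times,
)

# Load-weighted average: total consumed emissions divided by total load, per hour
regional_rate = (consumed_emissions * load_mw).sum(axis=1) / load_mw.sum(axis=1)

# Long format with "time" and "rate" columns
emission_rates = regional_rate.rename("rate").rename_axis("time").reset_index()
```

Weighting by load means a bus serving a large city counts for more than a bus serving a small town, which is usually closer to what “the emission rate of electricity consumed in New York” is meant to capture.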

Next, we reshape the data so it’s indexed by hour (in Eastern time) and date:

# Convert the UTC timestamps to Eastern time (time_EST follows daylight saving time)
emission_rates["time_EST"] = emission_rates.time.dt.tz_localize("UTC").dt.tz_convert("US/Eastern")
emission_rates["Hour (Eastern time)"] = emission_rates.time_EST.dt.hour
emission_rates["Date"] = emission_rates.time_EST.dt.date
# The fall-back timezone change results in a duplicated hour; keep the first occurrence
emission_rates = emission_rates.drop_duplicates(subset=["Hour (Eastern time)", "Date"], keep="first")
# Rows are hours of the day, columns are dates
emission_rates = emission_rates.pivot(columns="Date", index="Hour (Eastern time)", values="rate")

This results in data that looks like this:
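To make the reshaping concrete, here is a small worked example on synthetic data that spans the November 2020 fall-back change. The rates are just a counter, so they carry no physical meaning, and we use "America/New_York", the canonical name for the same Eastern time zone.

```python
import numpy as np
import pandas as pd

# 48 hourly UTC timestamps spanning the 2020-11-01 fall-back change
emission_rates = pd.DataFrame(
    {
        "time": pd.date_range("2020-11-01 00:00", periods=48, freq="h"),
        "rate": np.arange(48, dtype=float),  # placeholder values
    }
)

# Convert to Eastern time and split into hour-of-day and date
local = emission_rates["time"].dt.tz_localize("UTC").dt.tz_convert("America/New_York")
emission_rates["Hour (Eastern time)"] = local.dt.hour
emission_rates["Date"] = local.dt.date

# 1 AM occurs twice on 2020-11-01; keep the first occurrence
n_before = len(emission_rates)
emission_rates = emission_rates.drop_duplicates(
    subset=["Hour (Eastern time)", "Date"], keep="first"
)
n_dropped = n_before - len(emission_rates)

# Rows are hours of the day (0-23), columns are dates
heatmap_data = emission_rates.pivot(
    columns="Date", index="Hour (Eastern time)", values="rate"
)
```

The first and last dates are only partially covered, so the pivoted table contains missing values for the hours outside the sampled range. The same happens for the skipped hour on a spring-forward date.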

We can then visualize the data array as an image:

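import plotly.express as px

# carbon_colors is the list of hex colors used for the custom colorscale above
fig = px.imshow(
    emission_rates,
    color_continuous_scale=carbon_colors,
    title="New York State",
    zmin=0,
    zmax=1300,
    width=800,
    height=400,
)

# One tick per month, labeled with the abbreviated month name
fig.update_xaxes(dtick="M1", tickformat="%b")

fig.update_layout(
    coloraxis_colorbar=dict(
        title="Consumed rate<br>(lbs CO2e/MWh)",
    ),
)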

This results in a plot like the one below. We can see a seasonal trend with higher-emission electricity in the summer, and a daily trend with higher-emission electricity in the evenings and mornings. The same precautions we mentioned when looking at the choropleth plot above are relevant here: the temporal trends we see at the New York-wide level might be different from the trends in a specific county. It’s always worth looking at temporal trends at a variety of spatial resolutions to get the full picture.

Diving deeper

The visualizations listed here are a good starting point, but they usually raise more questions than they answer. Once we’ve used them to get a broad picture of emission rates on the grid, we dive deeper into specific regions to answer our customers’ questions, and other tools become useful. For example, bar and line charts make it easier to compare emission rates across regions, since they put emission rates on the y-axis instead of on the color axis. We also look at features like fuel mix and power flow to understand the drivers of emission rate trends.