Sankey diagrams now have the new Python package they deserved

Zlatan B
2 min read · Nov 17, 2023

From a pandas dataframe to a Sankey diagram in one line of Python

In a previous story, I gave five reasons why Sankey diagrams are so great for understanding energy data. In this article, I introduce a Python package dedicated to Sankey diagrams.

Why did Sankey diagrams deserve a Python package?

Well, because they are great for making data (energy-related and other kinds) understandable and beautiful, and because the previously available ways to draw Sankey diagrams in Python had shortcomings (see below).

How do I use this new Sankey package?

This is really easy:

  • Install the Sugikey package: pip install sugikey
  • Get your data in a dataframe format with columns source, target and value, one row per flow.
  • Use the high-level function sankey.sankey_from_df to draw a Sankey diagram in one line of Python, as follows:
import pandas as pd
from sugikey import sankey

# Placeholder path to a CSV file with columns source, target and value
csv_path = "flows.csv"

# Sankey diagram from a pandas dataframe
flow_df = pd.read_csv(csv_path)
sankey.sankey_from_df(flow_df)

This plots your Sankey diagram with Matplotlib, and the result may look like this:

An example of a Sankey diagram that Sugikey lets you plot with one line of code.
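
If your flows are not in a CSV file yet, the expected input format (columns source, target and value, one row per flow) can be sketched directly with pandas. The flow names and numbers below are made up for illustration, and check_flow_columns is a hypothetical helper, not part of Sugikey:

```python
import pandas as pd

# Made-up example flows, one row per flow (illustrative numbers only)
flow_df = pd.DataFrame(
    [
        ("gas", "power plant", 60.0),
        ("coal", "power plant", 40.0),
        ("power plant", "electricity", 45.0),
        ("power plant", "losses", 55.0),
    ],
    columns=["source", "target", "value"],
)


def check_flow_columns(df: pd.DataFrame) -> None:
    """Hypothetical helper: raise if a required column is missing."""
    required = {"source", "target", "value"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")


check_flow_columns(flow_df)
print(flow_df)
```

A dataframe built this way has the same shape as the one read from the CSV file above, so it should be usable with sankey.sankey_from_df in the same way.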

What more can the package do?

  • Customize the layout, colors and other aspects of your Sankey diagrams.
  • Draw Sankey diagrams from NetworkX directed graphs, which is actually how the data is processed and represented by the package internally.
  • Draw Sankey diagrams not only with Matplotlib, but also with Bokeh for more interactive diagrams that include tooltips.
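
To give an idea of what a flow list looks like as a NetworkX directed graph, here is a minimal sketch using NetworkX's own from_pandas_edgelist. The data is made up, and storing the flow size in an edge attribute called "value" is an assumption on my part: this article does not say which function or attribute name Sugikey expects for graph input, so check the documentation before relying on it.

```python
import networkx as nx
import pandas as pd

# Made-up example flows (illustrative numbers only)
flow_df = pd.DataFrame(
    {
        "source": ["gas", "coal", "power plant", "power plant"],
        "target": ["power plant", "power plant", "electricity", "losses"],
        "value": [60.0, 40.0, 45.0, 55.0],
    }
)

# One directed edge per flow; the flow size is kept as the edge
# attribute "value" (assumed name, not confirmed for Sugikey)
graph = nx.from_pandas_edgelist(
    flow_df,
    source="source",
    target="target",
    edge_attr="value",
    create_using=nx.DiGraph,
)

# Total inflow and outflow of a node, weighted by flow size
inflow = graph.in_degree("power plant", weight="value")
outflow = graph.out_degree("power plant", weight="value")
print(inflow, outflow)
```

Representing flows as a graph makes checks like the one above (does what goes into a node also come out?) straightforward.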

Have a look at the documentation for more information.

How could you draw Sankey diagrams in Python before Sugikey?

  • With Plotly, as explained here, which is indeed a nice way to plot interactive Sankey diagrams, but saving them in a vector format requires an additional export engine such as Kaleido.
  • With Matplotlib, as explained here, but this uses a much less intuitive syntax than Sugikey and is quite limited in the kinds of flow structures it can represent (it basically assumes one main trunk with incoming and outgoing flows).
  • With FloWeaver, which is quite cool but also a bit more involved and needs you to specify a lot more than just a list of flows: be prepared to think about “ProcessGroups”, “Waypoints”, “Partitions”, and “Bundles”… or just use Sugikey.
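
For comparison, here is a rough sketch of the Matplotlib approach mentioned in the list above, using the matplotlib.sankey module. The flow names and numbers are made up. Note how the diagram is described as a single trunk with signed flows (positive for inputs, negative for outputs) and an orientation per flow, rather than as a list of source/target pairs:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs headless

import matplotlib.pyplot as plt
from matplotlib.sankey import Sankey

# Signed flows around one trunk: positive = input, negative = output
flows = [1.0, -0.45, -0.55]
labels = ["fuel", "electricity", "losses"]
# 0 = along the trunk, 1 = from/to the top, -1 = from/to the bottom
orientations = [0, 0, -1]

fig, ax = plt.subplots()
ax.axis("off")
trunk = Sankey(ax=ax)
trunk.add(flows=flows, labels=labels, orientations=orientations)
diagrams = trunk.finish()

# Vector output is chosen through the file extension
fig.savefig("matplotlib_sankey.svg")
```

More complex networks require chaining several such trunks together by hand, which is where a flow-list interface becomes more convenient.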
