alph: tidy, legible visualisation of static graphs in Python

Create better-looking graph visualisations in Python notebooks and data apps with less friction

Published in ConnectedCompany, May 30, 2023

Clean and clear graph visualisations rarely materialise in a snap. Unless you’re a data viz rockstar, you’ll find yourself tinkering with many aspects of layouts, styling and graph structure to get the results you want.

This is especially true in the Python data ecosystem, where you might have to cobble together useful pieces of several libraries, and integrate with JavaScript.

alph’s main goal is to reduce this friction, and so simplify the process of creating static graph visualisations in Python notebooks and data apps.

It recognises that “legible” static graphs are a result of experimentation with layouts, styling and graph structure. So instead of trying to nail “the perfect viz” in one shot, it focuses on making the key enablers of this experimentation-focused workflow work together effectively.

To achieve this, alph builds on well-established foundations from the Python data ecosystem:

NetworkX for graph definition: easy creation of nx.Graph objects and derivatives from pandas DataFrames and core Python data structures, followed by rich manipulation via dozens of graph algorithms
Altair for visual representation: a natural choice because it balances convenience, expressiveness and versatility, which aligns well with the goals of alph
Standard “node positions” structure ({node_id: (x, y), …}): a natural structure for the output of node layout algorithms, as used in NetworkX
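As a small illustration of the first and third building blocks, the sketch below builds a graph from plain Python tuples and writes out a positions dictionary by hand. The names, weights and coordinates are made up for illustration and do not come from the alph examples.

```python
import networkx as nx

# Hypothetical interaction data: (source, target, weight)
interactions = [
    ("ann", "bob", 3),
    ("bob", "cat", 1),
    ("ann", "cat", 2),
    ("cat", "dan", 5),
]

G = nx.Graph()
# Each weight is stored under the "weight" edge attribute
G.add_weighted_edges_from(interactions)

# A "node positions" structure: one (x, y) pair per node id.
# NetworkX layout functions return a dictionary of this shape.
pos = {
    "ann": (-1.0, 0.5),
    "bob": (0.0, 1.0),
    "cat": (0.0, -0.5),
    "dan": (1.0, -1.0),
}

print(G.number_of_nodes(), G.number_of_edges())
print(G["cat"]["dan"]["weight"])
```

Because positions are just a dictionary keyed by node id, they can come from a layout algorithm, from existing data, or from a mix of both.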

alph started as a thin wrapper on top of this foundation. It kept being useful, and we kept adding to it. We decided to release it under the MIT licence in the hope that this benefits others with similar needs, and results in a richer, more versatile tool.

The name “alph” is just a concise combination of “altair” and “graph” — fitting enough for a straightforward, to-the-point tool.

At a glance

alph is a good fit for

creating static graphs in Python notebooks and data apps
reducing time needed to make bespoke visualisations — especially if you’re familiar with Altair
quickly experimenting with a variety of graph layouts
styling with the flexibility and consistency of Altair / Vega
reducing the amount of boilerplate code

… but not a good fit for

creating dynamic / animated graphs
handling custom user interactions
optimising for rendering performance
handling very large graphs effectively
getting amazing results in one shot

Getting to know alph

In the remainder of this post, we will cover

the basics — setup and plotting our first graph
Altair-based styling primitives
common layouts and how to explore which fits best
visualising large graphs
group nodes (sometimes also referred to as “combo” nodes)

All examples used here can also be found in the alph repository under /examples.

First-time setup

To get started, set up a notebook-based Python environment of your choice — a new Python virtual environment, a Colab or JupyterLab notebook, or whatever else is quick and natural for you.

We’ll install alph with a couple of optional libraries, useful for graph layouts. One is graphviz, the other is our fork of the classic ForceAtlas2 layout.

Install graphviz for your platform:
sudo apt install libgraphviz-dev graphviz (on Colab, Debian, Ubuntu…)
brew install graphviz (on a Mac)
Install alph with graphviz support (the quotes keep shells such as zsh from interpreting the square brackets):
pip install "alph[graphviz]"
Install forceatlas2 from our fork, plus cython for a speedup:
pip install cython "git+https://github.com/connectedcompany/forceatlas2.git"

That’s it for the setup part. Let’s create a simple graph to confirm everything is in place:

import networkx as nx
from alph import alph

G = nx.krackhardt_kite_graph()

chart = alph(G)
chart.configure_view(strokeWidth=0).properties(
  width=400, height=300
)

You should see a variation of the classic kite graph:

Basic concepts: graph, layout and styling

Next, let’s look at the basic elements of plotting an alph graph.

All plots are made by calling the alph function. This returns a standard Altair chart object that can be plotted, layered etc.

In a nutshell, the alph function accepts a NetworkX graph, a layout function for the nodes, and Altair-formatted styling information for nodes and edges:

alph(
    G,
    layout_fn=...,
    node_args={
        ...
    },
    edge_args={
        ...
    },
)

Let’s see a real example, taken from the styling example notebook:

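import altair as alt
import networkx as nx
from alph import alph

# G, seed and companies are defined earlier in the example notebook

palette = [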
    "#6CE6BA",
    "#DEC950",
    "#DE993A",
    "#DE4623",
    "#5B4CE0",
]

alph_params = dict(
    weight_attr="weight",
    layout_fn=lambda g: nx.spring_layout(
        g,
        weight="weight",
        k=8,
        iterations=5000,
        seed=seed
    ),
    node_args=dict(
        size=alt.Size(
            "degree_centrality",
            scale=alt.Scale(domain=[0,1], range=[12**2, 40**2]),
            legend=None
        ),
        fill=alt.Color(
            "company",
            scale=alt.Scale(domain=companies, range=palette),
        ),
        stroke="#333",
        strokeWidth=alt.Size(
            "degree_centrality",
            scale=alt.Scale(domain=[0,1], range=[2, 5]),
            legend=None
        ),
        tooltip_attrs=["name", "company"],
        label_attr="name",
    ),
    edge_args=dict(
        color="#000",
    ),
    width=800,
    height=600,
)

alph(G, **alph_params).configure_view(strokeWidth=0)

Given a simple interaction graph G, shown in the example notebook, this code produces the image at the top of this article:

A few things are worth highlighting:

A layout is provided by any function that returns a dictionary of node positions, like {node_id: array([x, y]), ...}. Here we’re using the spring layout that comes with NetworkX; we’ll see other options shortly.
Node and edge styling is provided by Altair chart arguments, with full API docs on the GitHub project page. Note how we can reference graph attributes directly.
The Altair chart returned by alph can be manipulated in the usual way. Here we’ve used configure_view to remove the border.
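The styling above refers to node attributes such as degree_centrality, company and name, so those attributes need to exist on the graph before plotting. Here is a minimal sketch of how they might be attached with NetworkX. The toy graph and company values are invented for illustration and are not the notebook’s data.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")])

# degree_centrality returns {node: degree / (n - 1)}, which lies within [0, 1]
# and so fits a size scale with domain=[0, 1]
nx.set_node_attributes(G, nx.degree_centrality(G), "degree_centrality")

# Illustrative categorical attribute (hypothetical values)
nx.set_node_attributes(
    G,
    {"ann": "Acme", "bob": "Acme", "cat": "Globex", "dan": "Globex"},
    "company",
)

# Use each node id as its display name
nx.set_node_attributes(G, {n: n for n in G.nodes}, "name")

# Distinct category values, usable as the domain of a colour scale
companies = sorted({c for _, c in G.nodes(data="company")})
print(companies)
```

Computing attributes on the graph first keeps the plotting call declarative: the styling only names the attribute and the scale to apply.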

More layouts

Since layouts are simply dictionaries of node positions ({node_id: array([x, y]), …}, with x and y between -1 and +1), it is easy to combine multiple layout providers, or simply position nodes based on pre-existing coordinates, such as geolocation or 2D embeddings.
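Because a layout is just a function from a graph to a positions dictionary, writing one by hand is straightforward. The sketch below uses two hypothetical helpers, rescale_positions and fixed_layout, which are not part of alph. They rescale arbitrary pre-existing 2D coordinates (for example, embeddings) into the -1 to +1 range and wrap them as a layout function.

```python
def rescale_positions(raw):
    """Linearly map {node: (x, y)} coordinates into [-1, 1], independently per axis."""
    xs = [x for x, _ in raw.values()]
    ys = [y for _, y in raw.values()]

    def scale(v, lo, hi):
        # A degenerate axis (all values equal) is centred at 0
        return 0.0 if hi == lo else 2 * (v - lo) / (hi - lo) - 1

    return {
        node: (scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys)))
        for node, (x, y) in raw.items()
    }


def fixed_layout(raw):
    """Return a layout function that ignores graph structure and uses the given coordinates."""
    pos = rescale_positions(raw)
    return lambda g: {node: pos[node] for node in g}


# Hypothetical 2D embedding coordinates keyed by node id
embedding = {"a": (10.0, 200.0), "b": (20.0, 300.0), "c": (30.0, 250.0)}
layout_fn = fixed_layout(embedding)

# Iterating a NetworkX graph yields node ids, so a plain list stands in for one here
print(layout_fn(["a", "b", "c"]))
# {'a': (-1.0, -1.0), 'b': (0.0, 1.0), 'c': (1.0, 0.0)}
```

A function like this could then be passed as layout_fn in the call to alph. Note that scaling each axis independently stretches the point cloud to fill the square; where the aspect ratio matters, a single shared scale factor would be the better choice.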

The layouts gallery example notebook shows several popular graph layout options in action:

Additionally, the flight routes example notebook shows how we can use pre-existing layout information — in this case, geo coordinates — to position nodes. In that particular case, we simply map lat / long coordinates to the -1 to +1 range to create the required positions dictionary:

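import numpy as np

# airports is a pandas DataFrame with airport_id, longitude and latitude columns
pos_long_lat = {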
    k: (
        np.interp(item["longitude"], [-180,180], [-1, 1]),
        np.interp(item["latitude"], [-90,90], [-1,1]),
    )
    for k, item in airports.set_index("airport_id").to_dict("index").items()
}

This produces the following visual:

Large graphs

The size of graphs alph can render is determined by Altair.

Internally, alph turns graphs into a tabular dataset, with each row representing an edge.

Out of the box, Altair errors out if a dataset given to it exceeds 5,000 rows. As stated in the docs,

“This is not because Altair cannot handle larger datasets, but it is because it is important for the user to think carefully about how large datasets are handled.”

The Large Datasets section in the Altair docs provides helpful guidance on some large dataset tradeoffs, and options for handling them.
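Before plotting, it can be useful to check whether a graph is likely to hit this limit. The helper below is a hypothetical pre-flight check, not an alph function. It assumes one dataset row per edge, as described above, and also compares the node count to be conservative.

```python
import networkx as nx

ALTAIR_DEFAULT_MAX_ROWS = 5000  # Altair's default row limit


def exceeds_default_row_limit(g, max_rows=ALTAIR_DEFAULT_MAX_ROWS):
    """Return True if the edge count or the node count is above the row limit."""
    return max(g.number_of_edges(), g.number_of_nodes()) > max_rows


small = nx.path_graph(100)    # 100 nodes, 99 edges
large = nx.path_graph(6000)   # 6000 nodes, 5999 edges

print(exceeds_default_row_limit(small))
print(exceeds_default_row_limit(large))
```

If the check comes back positive, the options are to change how data is passed to the chart, to reduce the graph, or to lift the limit with alt.data_transformers.disable_max_rows() at the cost of larger notebooks and charts.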

Our Flight Routes example shows one approach. Rather than embedding data directly in the chart, a JSON dataset is stored locally and read from the filesystem:

import os

import altair as alt

CHART_DATA_DIR="out"

def chart_data_dir(data, data_dir=CHART_DATA_DIR):
    os.makedirs(data_dir, exist_ok=True)
    return alt.to_json(data, filename=data_dir + "/{prefix}-{hash}.{extension}")

alt.data_transformers.register("chart_data_dir", chart_data_dir)
alt.data_transformers.enable("chart_data_dir")

An alternative to plotting large graphs unmodified, particularly if the result is an illegible hairball, is to filter or prune them.

Many approaches exist. The Flight Routes example demonstrates a couple.
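As one illustration of pruning, the sketch below drops weak edges and then keeps only the largest connected component. It is a generic NetworkX recipe for undirected graphs, not necessarily what the Flight Routes notebook does; the prune function, its threshold and the toy data are hypothetical.

```python
import networkx as nx


def prune(g, weight_attr="weight", min_weight=2):
    """Keep edges with weight >= min_weight, then keep the largest connected component."""
    strong = [
        (u, v)
        for u, v, w in g.edges(data=weight_attr, default=0)
        if w >= min_weight
    ]
    pruned = g.edge_subgraph(strong)  # nodes left without edges are dropped
    if pruned.number_of_nodes() == 0:
        return nx.Graph()
    largest = max(nx.connected_components(pruned), key=len)
    return pruned.subgraph(largest).copy()


G = nx.Graph()
G.add_weighted_edges_from(
    [("a", "b", 5), ("b", "c", 4), ("c", "d", 1), ("d", "e", 3)]
)

H = prune(G, min_weight=2)
print(sorted(H.nodes), H.number_of_edges())
```

Here the weak c-d edge is removed, which splits the graph into {a, b, c} and {d, e}, and only the larger part is kept. The right threshold depends on the data, so it is worth trying a few values and comparing the resulting plots.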

Group nodes

Sometimes it is helpful to retain and reveal hierarchy implied by attributes of nodes in a graph. One way of doing this is to group related nodes, and show edges within and across groups. Some tools and libraries refer to these group-level nodes as “combo nodes”.
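To make the idea of edges within and across groups concrete, here is a small sketch in plain Python. It only illustrates the concept and is not how alph implements grouping; the group assignments and edges are made up.

```python
from collections import Counter

# Hypothetical node -> group assignments and an undirected edge list
groups = {"a": "red", "b": "red", "c": "blue", "d": "blue", "e": "green"}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]

within = Counter()  # edges whose endpoints share a group
across = Counter()  # edges between two different groups

for u, v in edges:
    gu, gv = groups[u], groups[v]
    if gu == gv:
        within[gu] += 1
    else:
        # Sort the pair so (red, blue) and (blue, red) count as the same group-level edge
        across[tuple(sorted((gu, gv)))] += 1

print(dict(within))  # {'red': 1, 'blue': 1}
print(dict(across))  # {('blue', 'red'): 2, ('blue', 'green'): 1}
```

The within-group edges are what gets drawn inside each group node, while the across-group counts correspond to the edges drawn between group nodes.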

The Group Nodes example notebook shows how this can be applied to some clusters picked out from the Les Miserables dataset:

To plot this grouped layout, we need to add a few more arguments to the call to alph.

Here is a quick overview, with additional details and a full example in the notebook:

alph(
  G,
  weight_attr="value",
  layout_fn=layout_fn,                   # layout to use inside group nodes
  node_args=...,                         # similarly, this is node style ...
  edge_args=...,                         # ...and edge style inside group nodes
  combo_group_by="group",                # node attribute to group by
  combo_layout_fn=...,                   # layout for combo nodes
  combo_node_additional_attrs={
      ...                                # extra attrs for group-level nodes
                                         # can be used in combo node/edge args
  },
  combo_node_args=...,                   # group node style
  combo_edge_args=...,                   # group edge style
  combo_size_scale_range=[40**2,100**2], # range of group node sizes
  combo_inner_graph_scale_factor=0.6,    # size relative to group node size
  combo_empty_attr_action="promote",     # how to handle nodes w/o group attr
  ...
)

Grouped node plots come with a couple of caveats. Multi-level hierarchies are not currently supported. Also, tuning positions and layouts takes more effort than with non-grouped graphs.

Nevertheless, with a bit of persistence, good results will usually follow.

Putting alph to work

Now that you’ve seen how alph can help you create better graph visualisations, it’s time to put it to work on your own data and projects. As usual, the best way to learn more about the library is to get hands-on with it.

The examples we’ve just covered and the API docs have hopefully given you enough to make a fun and productive start. If you run into problems, please raise an issue.

Finally, we invite you to help make the library even better. All contributions are welcome and appreciated, be they features, feedback, bug reports and fixes, doc tweaks, or simply using the library.

Visit the alph page on GitHub to get started, and join us on the journey to making graph visualisations in Python that bit more accessible.