Introduction to PyTorch BigGraph — with Examples
PyTorch BigGraph is a tool to create and handle large graph embeddings for machine learning. Currently there are two approaches in graph-based neural networks:
- Directly use the graph structure and feed it to a neural network, so that the graph structure is preserved at every layer. Graph CNNs take this approach; see for instance my post or this paper on the topic.
- Most graphs, however, are too large for that. The alternative is to create a large embedding of the graph and then use the embedding as features in a traditional neural network.
PyTorch BigGraph handles the second approach, and we will do so as well below. Just for reference, let’s talk about the size aspect for a second. Graphs are usually encoded by their adjacency matrix. If you have a graph with 3,000 nodes and an edge between every pair of nodes, you end up with about 9,000,000 entries in your matrix. Even if that matrix is sparse, it apparently exceeds the memory of most GPUs, according to the paper linked above.
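To make the size argument concrete, here is a quick back-of-the-envelope calculation (my own sketch, not from the BigGraph paper; the 50-million-node figure is an illustrative assumption for a recommendation-scale graph):

```python
def adjacency_matrix_size(n_nodes, bytes_per_entry=4):
    """Entries and bytes for a dense n x n adjacency matrix,
    assuming float32 (4 bytes) per entry."""
    entries = n_nodes * n_nodes
    return entries, entries * bytes_per_entry

# The 3,000-node example from the text:
entries, size = adjacency_matrix_size(3_000)
print(entries)            # 9000000 entries
print(size / 1e6, "MB")   # 36.0 MB as dense float32

# A hypothetical recommendation graph with 50 million nodes:
entries, size = adjacency_matrix_size(50_000_000)
print(size / 1e12, "TB")  # 10000.0 TB dense -- far beyond any GPU
```

At dense scale the matrix becomes intractable long before you reach real-world graph sizes, which is why compact learned embeddings are the practical representation.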
If you think about the graphs typically used in recommendation systems, you’ll realise they are much larger than that. There are already some excellent posts about the how and why of BigGraph, so I won’t spend more time on that. I’m interested in applying BigGraph to my machine learning problem, and for that I like to take the simplest examples and…