Knowing Your Neighbours: Machine Learning on Graphs

Pantelis Elinas
Published in stellargraph
Jun 6, 2019


We live in a connected world and generate vast amounts of connected data. Social networks, financial transaction systems, biological networks, transportation systems and telecommunication networks are all examples. The paper citation network displayed in Figure 1 is another.

Figure 1: Visualisation of a paper citation network. Nodes represent research papers and edges represent citations between them. Each node's colour indicates the paper's subject, with seven colours encoding seven topics.

Connected data can be represented using a graph, a data structure widely used in computer science.

In this article, we introduce the different types of connected data, what they represent, and the problems we can solve with them. We also introduce graph convolutional networks (GCNs) and, using GCN as an example, explain how modern machine learning methods can build predictive models of connected data.

Definitions and types of networks

A graph data structure has two basic elements: nodes and edges (see Figure 2 below). Nodes represent entities in the data, such as members of an online social network, while edges represent relationships between those entities, such as friendship between members. Together, this web of nodes and edges forms a graph, a mathematical representation of the network structure of the data.
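
To make the node and edge terminology concrete, here is a minimal sketch in plain Python. It assumes an undirected friendship network; the member names and the `build_adjacency` helper are made up for illustration and do not come from any particular library.

```python
# Hypothetical members of a small online social network (the nodes).
nodes = ["alice", "bob", "carol", "dave"]

# Hypothetical friendships between members (the edges).
# Each pair is undirected: if alice is friends with bob, bob is friends with alice.
edges = [("alice", "bob"), ("alice", "carol"), ("carol", "dave")]


def build_adjacency(nodes, edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)  # store both directions because friendship is mutual
    return adjacency


adjacency = build_adjacency(nodes, edges)
print(sorted(adjacency["alice"]))  # ['bob', 'carol']
```

Storing the graph as a mapping from each node to its neighbours makes the question "who is connected to this node?" a single lookup. A directed relationship, such as one paper citing another, could be modelled the same way by adding only the `u` to `v` direction.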

Pantelis Elinas
stellargraph

I am a senior machine learning research engineer. I enjoy working on interesting problems, sharing knowledge, and developing useful software tools.