Knowing Your Neighbours: Machine Learning on Graphs
We live in a connected world and generate a vast amount of connected data. Social networks, financial transaction systems, biological networks, transportation systems and telecommunication networks are all examples. The paper citation network displayed in Figure 1 is another example of connected data.
Connected data can be represented as a graph, a data structure widely used in computer science.
In this article, we introduce the different types of connected data, what they represent, and the kinds of problems we can solve with them. We also introduce graph convolutional networks (GCNs) and, using them as an example, explain how modern machine learning methods can build predictive models of connected data.
Definitions and types of networks
A graph data structure has two basic elements: nodes and edges (see Figure 2 below). Nodes represent entities in the data, such as members of an online social network, while edges represent relationships between those entities, such as friendship between two members. Together, this web of nodes and edges forms a graph, a mathematical representation of the network structure of the data.
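To make the node and edge vocabulary concrete, here is a minimal sketch in Python of one common way to store a small graph: a list of edges turned into an adjacency map. The member names, the friendships, and the helper `build_adjacency` are all made up for illustration, and the friendships are assumed to be mutual (undirected).

```python
# Hypothetical members of a social network (the nodes).
nodes = ["alice", "bob", "carol", "dave"]

# Hypothetical friendships (the edges). Each pair is treated as
# undirected: if alice is friends with bob, bob is friends with alice.
edges = [("alice", "bob"), ("alice", "carol"), ("carol", "dave")]


def build_adjacency(nodes, edges):
    """Map each node to the set of nodes it shares an edge with."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


adjacency = build_adjacency(nodes, edges)

# Looking up a node's neighbours is now a dictionary access.
print(sorted(adjacency["alice"]))  # ['bob', 'carol']
```

An adjacency map like this makes "who are this node's neighbours?" a cheap lookup, and that is the basic question neighbourhood-based methods ask of a graph.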