Graph Convolutional Neural Network

Ruocheng Guo
Mar 5, 2019


Figure: GCN filters the original features via convolution with the graph Laplacian in the spectral domain (figure credit [3])

Topology is everywhere. Even when there is no explicit graph such as a social network, there can be implicit ones, like the graphs constructed in spectral clustering (e.g., a KNN graph).

CNNs cannot handle non-Euclidean structured data: because the number of neighbors varies from node to node, we cannot directly apply a convolutional kernel of fixed size [2].

The existing solutions can be categorized into two classes: (1) spatial-domain approaches and (2) spectral-domain approaches.

Spectral Domain: Here we focus on the spectral domain, which forms the foundation of GCN. Spectral graph theory refers to the research area where the properties of a graph are studied through the eigenvectors and eigenvalues of its Laplacian matrix.

Why the Laplacian matrix? (1) The Laplacian matrix is symmetric, so we can study its spectral properties by performing an eigenvalue decomposition on it. (2) The Laplacian matrix is quite sparse: only the diagonal entries and the entries corresponding to 1-hop neighbors are non-zero. (3) We can connect the Laplacian matrix to the (continuous) Laplacian operator.

Eigenvalue decomposition requires an n × n square matrix to have n linearly independent eigenvectors. Fortunately, the Laplacian matrix is symmetric and positive semi-definite, which implies that (1) it has exactly n linearly independent eigenvectors, (2) all of its eigenvalues are non-negative, and (3) its eigenvectors can be chosen to be mutually orthogonal.
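As a minimal numpy sketch (the 4-cycle graph and all variable names here are my own illustration, not from the papers), we can build the Laplacian L = D − A and verify these properties:

```python
import numpy as np

# Adjacency matrix of a small undirected graph: the 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

D = np.diag(A.sum(axis=1))  # degree matrix
L = D - A                   # combinatorial graph Laplacian

# Eigendecomposition L = U diag(lam) U^T; eigh assumes a symmetric matrix.
lam, U = np.linalg.eigh(L)

assert np.allclose(L, L.T)                     # symmetric
assert np.all(lam >= -1e-10)                   # positive semi-definite
assert np.allclose(U.T @ U, np.eye(len(lam)))  # orthonormal eigenvectors
# The columns of U form the graph Fourier basis used below.
```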

Why the spectral approach? How does a GCN layer work?

Why not the spatial approach? It faces the challenge of matching local neighborhoods: there is no unique, formal definition of translation on graphs from a spatial perspective.

A spectral approach, however, provides a well-defined localization operator on graphs via convolutions with a Kronecker delta implemented in the spectral domain. The convolution theorem defines convolutions as linear operators that are diagonalized in the Fourier basis (the eigenvectors of the Laplacian operator). Two limitations remain: (1) a filter defined in the spectral domain is not naturally localized, and (2) translations are costly due to the O(n²) multiplication with the graph Fourier basis. Both limitations can be resolved by filter parameterization.

The graph convolution in the spectral domain is often defined as:

$$y = g_\theta(L)\,x = U\, g_\theta(\Lambda)\, U^T x,$$

i.e., the input feature vector x filtered by the graph convolution operator. Here $L = U \Lambda U^T$ is the eigendecomposition of the graph Laplacian: $U^T x$ is the graph Fourier transform of x, $g_\theta(\Lambda)$ filters it in the spectral domain, and multiplying by U transforms the result back.

The most straightforward parameterization of the kernel is

$$g_\theta(\Lambda) = \mathrm{diag}(\theta), \qquad \theta \in \mathbb{R}^n,$$

a non-parametric filter with one free parameter per eigenvalue.
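A minimal sketch of this filtering operation in numpy (the function name and the random test graph are my own; this naive version deliberately pays the full eigendecomposition cost):

```python
import numpy as np

def spectral_filter(L, x, theta):
    """Filter a graph signal x with the non-parametric filter g_theta = diag(theta)."""
    lam, U = np.linalg.eigh(L)  # graph Fourier basis (O(n^3), the bottleneck)
    x_hat = U.T @ x             # graph Fourier transform of x
    y_hat = theta * x_hat       # multiply entrywise by g_theta(lam)
    return U @ y_hat            # inverse graph Fourier transform

# Sanity check: the all-ones filter is the identity operator.
rng = np.random.default_rng(0)
n = 5
A = np.triu(rng.integers(0, 2, (n, n)), 1)
A = (A + A.T).astype(float)            # random undirected graph
L = np.diag(A.sum(axis=1)) - A
x = rng.normal(size=n)
assert np.allclose(spectral_filter(L, x, np.ones(n)), x)
```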

However, the non-parametric filter is not localized in space and has a learning complexity of O(n). To learn localized filters, we exploit the property that

$$(L^K)_{ij} = 0 \quad \text{if} \quad d_G(i, j) > K,$$

i.e., the (i, j) entry of the K-th power of the Laplacian is zero whenever the shortest-path distance between nodes i and j is larger than K. A K-th order polynomial of the Laplacian is therefore exactly K-localized.
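This is easy to check numerically on a path graph, where shortest-path distances can be read off directly (a small illustration of mine, not from the papers):

```python
import numpy as np

# Path graph 0-1-2-3-4-5, so dist(0, j) = j.
n = 6
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L = np.diag(A.sum(axis=1)) - A

K = 2
LK = np.linalg.matrix_power(L, K)
assert np.allclose(LK[0, K + 1:], 0.0)  # zero beyond K hops from node 0
```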

The SOTA parameterization of filters is based on the Chebyshev expansion of (K-1)-th order polynomial filters:

$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{\Lambda}),$$

where $\theta \in \mathbb{R}^K$ is the vector of Chebyshev coefficients, $T_k$ is the Chebyshev polynomial of order k, and

$$\tilde{\Lambda} = \frac{2\Lambda}{\lambda_{\max}} - I_n$$

is a diagonal matrix of rescaled eigenvalues of the graph Laplacian (rescaled into [-1, 1], the domain of the Chebyshev polynomials).

We can rewrite the filtering operation as

$$y = g_\theta(L)\,x = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{L})\,x,$$

where

$$\tilde{L} = \frac{2L}{\lambda_{\max}} - I_n$$

denotes the rescaled graph Laplacian. This follows from the identity $U\, T_k(\tilde{\Lambda})\, U^T = T_k(\tilde{L})$, so no explicit eigendecomposition is needed.

By denoting

$$\bar{x}_k = T_k(\tilde{L})\,x,$$

we can compute the filtering operation based on

$$\bar{x}_0 = x, \qquad \bar{x}_1 = \tilde{L}\,x;$$

then the higher-order terms can be computed by the recurrence rule of Chebyshev polynomials as

$$\bar{x}_k = 2\tilde{L}\,\bar{x}_{k-1} - \bar{x}_{k-2},$$

and finally, we obtain

$$y = [\bar{x}_0, \bar{x}_1, \ldots, \bar{x}_{K-1}]\,\theta.$$

Each step of the recurrence is one multiplication with the sparse matrix $\tilde{L}$, which has O(|E|) non-zero entries, so the total time complexity is O(K|E|).
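A minimal sketch of this recurrence with scipy sparse matrices (function and variable names are my own; in practice λ_max would come from a sparse eigensolver such as scipy.sparse.linalg.eigsh):

```python
import numpy as np
import scipy.sparse as sp

def cheb_filter(L, x, theta, lmax):
    """Compute y = sum_k theta[k] * T_k(L_tilde) @ x via the Chebyshev recurrence.

    Each step is one sparse matrix-vector product, so the total cost is
    O(K|E|) and no eigendecomposition is needed.
    """
    n = L.shape[0]
    L_tilde = (2.0 / lmax) * L - sp.identity(n, format="csr")  # rescaled Laplacian
    x_bars = [x, L_tilde @ x]  # T_0(L~) x = x, T_1(L~) x = L~ x
    for _ in range(2, len(theta)):
        # Chebyshev recurrence: x_bar_k = 2 L~ x_bar_{k-1} - x_bar_{k-2}
        x_bars.append(2.0 * (L_tilde @ x_bars[-1]) - x_bars[-2])
    return sum(t * xb for t, xb in zip(theta, x_bars))

# Usage on a tiny graph (a path on 3 nodes, whose largest eigenvalue is 3).
A = sp.csr_matrix(np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float))
L = (sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A).tocsr()
y = cheb_filter(L, np.array([1.0, 0.0, 0.0]), theta=[0.5, 0.3, 0.2], lmax=3.0)
```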

Pooling requires meaningful neighborhoods in which similar nodes are clustered together. Since graph clustering is an NP-hard problem, heuristics have to be used. The details can be found in the paper [1].

In a more recent paper [3], the authors show that when there exists an underlying network structure (e.g., the citation graph of the Cora dataset), we do not need to worry about the graph construction step, which looks quite complicated in [1]. They also show that their renormalized first-order approximation (Eq. 8 in [3]) outperforms other parameterizations.
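For reference, that first-order rule collapses each layer to $Z = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X \Theta$ with $\tilde{A} = A + I_n$; a minimal numpy sketch (the weight matrix W and all names here are my own illustration):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One layer of the renormalized first-order GCN from [3]."""
    n = A.shape[0]
    A_tilde = A + np.eye(n)                              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    A_hat = A_tilde * np.outer(d_inv_sqrt, d_inv_sqrt)   # D~^{-1/2} A~ D~^{-1/2}
    return A_hat @ X @ W                                 # propagate, then transform

# Usage: 4 nodes, 3 input features, 2 output features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))                    # trainable weights (hypothetical init)
Z = np.maximum(gcn_layer(A, X, W), 0.0)        # ReLU activation, as in [3]
```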

Reference:

[1] Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. “Convolutional neural networks on graphs with fast localized spectral filtering.” In Advances in neural information processing systems, pp. 3844–3852. 2016.

[2] How to understand GCN? (in Chinese) https://www.zhihu.com/question/54504471/answer/332657604

[3] Kipf, Thomas N., and Max Welling. “Semi-supervised classification with graph convolutional networks.” arXiv preprint arXiv:1609.02907 (2016).
