1594478340

A Graph Neural Network (GNN) is a type of neural network that can be applied directly to graph-structured data. My previous post gave a brief introduction to GNNs; readers can refer to it for more details.

Many research works have shown the power of GNNs for understanding graphs, but how and why a GNN works remains a mystery to most people. Unlike a CNN, where we can extract the activations of each layer to visualize the network’s decisions, in a GNN it is hard to get a meaningful explanation of what features the network has learnt. Why does a GNN decide that a node is class A instead of class B? Why does a GNN decide that a graph is a chemical or a molecule? It seems that the GNN sees some useful structural information and makes its decisions based on these observations. The question is: what does the GNN actually observe?

GNNExplainer is introduced in this paper.

Briefly speaking, it is trying to build a network to learn what a GNN has learnt.

The main principle of GNNExplainer is to remove redundant information in a graph that does not directly affect the decisions. To explain a graph, we want to know which features or structures in the graph are crucial to the decisions of the neural network. If a feature is important, then removing it or replacing it with something else should change the prediction substantially. On the other hand, if removing or altering a feature does not affect the prediction, the feature is considered non-essential and should not be included in the explanation of the graph.

The primary objective of GNNExplainer is to generate a minimal graph that explains the decision for a node or a graph. To achieve this, the problem is defined as finding a subgraph of the computation graph that minimizes the difference between the prediction scores obtained with the whole computation graph and with the minimal graph. In the paper, this is formulated as maximizing the mutual information (MI) between the prediction made on the computation graph G and the minimal graph Gs.
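The objective can be written out explicitly. The form below is reconstructed from my reading of the GNNExplainer paper rather than from this post, so treat the notation as an assumption: Y is the GNN's prediction, G_S is the candidate subgraph, and X_S are the node features associated with it.

```latex
\max_{G_S} \; MI\bigl(Y, (G_S, X_S)\bigr) \;=\; H(Y) \;-\; H\bigl(Y \mid G = G_S,\; X = X_S\bigr)
```

Since the GNN is already trained, the entropy term H(Y) is fixed, so maximizing the mutual information amounts to minimizing the conditional entropy of the prediction given only the subgraph.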

Besides, there is a secondary objective: the graph needs to be minimal. Although minimality is already implied by the first objective, it needs its own formulation. The paper addresses this by adding a loss on the number of edges. The GNNExplainer loss is therefore the combination of a prediction loss and an edge size loss.
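To make the two-part loss concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function name `explainer_loss`, the soft edge mask built from `mask_logits`, and the trade-off weight `size_coeff` are all assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def explainer_loss(target_prob, mask_logits, size_coeff=0.005):
    """Prediction loss plus edge-size penalty for one candidate explanation.

    target_prob : probability the GNN assigns to the explained class when it
                  only sees the softly masked edges.
    mask_logits : one real number per edge; sigmoid turns it into a soft
                  "keep this edge" weight in (0, 1).
    size_coeff  : hypothetical trade-off weight, not a value from the paper.
    """
    edge_mask = [sigmoid(m) for m in mask_logits]
    prediction_loss = -math.log(target_prob)      # small when the prediction is preserved
    edge_size_loss = size_coeff * sum(edge_mask)  # small when few edges are kept
    return prediction_loss + edge_size_loss

# Same prediction quality, but the first mask keeps fewer edges.
print(explainer_loss(0.9, [-5.0, 5.0]) < explainer_loss(0.9, [5.0, 5.0]))
```

In a real explainer the mask logits would be trainable parameters updated by gradient descent on this loss; the sketch only evaluates the loss for a given mask.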

#graph #graph-neural-networks #graph-theory #pattern-recognition #machine-learning

1595691960

In this post, we’re gonna take a close look at one of the well-known *graph neural networks*, named *GCN*. First, we’ll build some intuition for how it works, then we’ll go deeper into the maths behind it.

Many problems are graphs by nature. In the real world, a lot of data comes in the form of graphs, such as molecules, social networks, and paper citation networks.

Examples of graphs. (Picture from [1])

Typical tasks on graphs include:

- Node classification: Predict the type of a given node
- Link prediction: Predict whether two nodes are linked
- Community detection: Identify densely linked clusters of nodes
- Network similarity: Measure how similar two (sub)networks are

In the graph, we have node features (the data of nodes) and the structure of the graph (how nodes are connected).

For the former, we can easily get the data from each node. But when it comes to the structure, extracting useful information is not trivial. For example, if two nodes are close to one another, should we treat them differently from other pairs? What about high- and low-degree nodes? In fact, each specific task can consume a lot of time and effort on feature engineering alone, i.e., on distilling the structure into our features.

Feature engineering on graphs. (Picture from [1])
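As a small illustration of what hand-crafted structural features can look like, the sketch below computes each node's degree and the number of triangles it takes part in. The choice of these two features and the function name `structural_features` are assumptions for illustration; real pipelines often need many more task-specific features.

```python
from collections import defaultdict

def structural_features(edges):
    """Return {node: (degree, number_of_triangles)} for an undirected graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    features = {}
    for node, nbrs in neighbors.items():
        # A triangle through `node` is a pair of its neighbors that are also linked.
        triangles = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbors[a])
        features[node] = (len(nbrs), triangles)
    return features

edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
feats = structural_features(edges)
print(feats["c"])  # (3, 1): three neighbors, one triangle (a, b, c)
```

Every such feature has to be designed, implemented, and validated by hand, which is the effort that representation learning tries to avoid.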

It would be much better to somehow take both the node features and the structure as input, and let the machine figure out by itself what information is useful.

That’s why we need Graph Representation Learning.

We want the model to learn the “feature engineering” from the graph by itself. (Picture from [1])

**Paper:** **Semi-supervised Classification with Graph Convolutional Networks (2017) [3]**

**GCN** is a type of **convolutional neural network** that **can work directly on graphs** and take advantage of their structural information.

It solves the problem of classifying nodes (such as documents) in a graph (such as a citation network), where labels are only available for a small subset of nodes (semi-supervised learning).

Example of semi-supervised learning on graphs. Some nodes don’t have labels (unknown nodes).
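One common way to train in this setting is to compute the classification loss only on the labeled nodes, while the model still sees the whole graph. The sketch below shows just that masking step in plain Python; the function name `masked_cross_entropy` and the toy probabilities are made up for illustration, and the model producing the probabilities is left out.

```python
import math

def masked_cross_entropy(probs, labels):
    """Average cross-entropy over labeled nodes only.

    probs  : per-node list of class probabilities predicted by the model.
    labels : per-node class index, or None when the node is unlabeled.
    """
    losses = [
        -math.log(p[y])
        for p, y in zip(probs, labels)
        if y is not None          # unlabeled nodes contribute nothing to the loss
    ]
    return sum(losses) / len(losses)

probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
labels = [0, None, 1]             # the middle node has no label
print(masked_cross_entropy(probs, labels))
```

The unlabeled node still matters during training, because its features flow to its neighbors through the graph structure; it just has no loss term of its own.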

#graph-neural-networks #graph-convolution-network #deep-learning #neural-networks

1602838800

This post summarizes the SimGNN paper, which aims at fast graph similarity computation. Graphs are structures that link entities, called nodes, through relationships called edges. Graphs exist everywhere: from bonds between atoms to friendships on Facebook, all of these scenarios can be represented as graphs. One of the fundamental graph problems is measuring the similarity between graphs, which can be defined using metrics such as:

- Graph Edit Distance
- Maximum Common Subgraph

However, the currently available algorithms for calculating these metrics have high complexity, and it is not yet feasible to compute the exact GED with them for graphs with more than 16 nodes.

Some ways to compute these metrics are:

- Pruning-verification frameworks
- Approximating the GED in fast, heuristic ways

SimGNN follows another approach to this problem: turning similarity computation into a learning problem.

Before getting into how SimGNN works, we must know the requirements this model has to satisfy. They are:

- **Representation Invariant:** Different representations of the same graph should give the same results.
- **Inductive:** Should be able to predict results for unseen graphs.
- **Learnable:** Must work on different similarity metrics like GED and MCS.
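The representation-invariance requirement can be illustrated with a toy example: renumbering the nodes changes the adjacency matrix, but a permutation-invariant summary stays the same. The sorted degree sequence used here is only a stand-in chosen for illustration, not what SimGNN computes.

```python
def adjacency(n, edges):
    """Adjacency matrix of an undirected, unweighted graph on n nodes."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

def relabel(edges, perm):
    """Rename every node i to perm[i]; the graph itself is unchanged."""
    return [(perm[i], perm[j]) for i, j in edges]

def degree_summary(A):
    # Sorting removes any dependence on the node ordering.
    return sorted(sum(row) for row in A)

edges = [(0, 1), (1, 2), (1, 3)]      # a small star centred on node 1
perm = {0: 3, 1: 0, 2: 1, 3: 2}       # same graph, nodes renumbered

A1 = adjacency(4, edges)
A2 = adjacency(4, relabel(edges, perm))
print(A1 == A2)                                   # False
print(degree_summary(A1) == degree_summary(A2))   # True
```

A model that consumed the raw adjacency matrix entry by entry would see two different inputs here, which is exactly what the requirement rules out.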

**SimGNN Approach:** To achieve the above-stated requirements, SimGNN uses two strategies:

- Design a learnable embedding function: This maps a graph to an embedding vector that provides a global summary of the graph. Some important nodes are selected and used for computing the embedding. (Lower time complexity.)
- Pairwise node comparison: The graph-level embeddings above are too coarse, so pairwise similarity scores are further computed between the nodes of the two graphs; histogram features are extracted from these scores and combined with the graph-level information. (This is a time-consuming strategy.)
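The pairwise node comparison can be sketched as follows, assuming node embeddings for both graphs are already available. Scoring each pair with the sigmoid of an inner product, the number of bins, and the function name are assumptions made for this sketch rather than details taken from the post.

```python
import math

def pairwise_similarity_histogram(emb1, emb2, bins=4):
    """Normalized histogram of sigmoid(inner product) over all cross-graph node pairs."""
    counts = [0] * bins
    for u in emb1:
        for v in emb2:
            dot = sum(a * b for a, b in zip(u, v))
            score = 1.0 / (1.0 + math.exp(-dot))
            # score lies in (0, 1); clamp the index so a rounded-up score still fits
            counts[min(int(score * bins), bins - 1)] += 1
    total = len(emb1) * len(emb2)
    return [c / total for c in counts]

emb1 = [[1.0, 0.0], [0.0, 1.0]]                 # toy embeddings, graph 1
emb2 = [[2.0, 0.0], [0.0, -2.0], [0.0, 0.0]]    # toy embeddings, graph 2
hist = pairwise_similarity_histogram(emb1, emb2)
print(hist)
```

The double loop touches every pair of nodes, which is why this strategy is the time-consuming one. The histogram does not depend on node order, so it remains compatible with the representation-invariance requirement.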

#graph-edit-distance #machine-learning #graph-neural-networks #graph-convolution-network

1594474140

Graph Neural Networks (GNNs) are widely used today in diverse applications of social sciences, knowledge graphs, chemistry, physics, neuroscience, etc., and accordingly there has been a great surge of interest and growth in the number of papers in the literature.

However, it has been increasingly difficult to gauge the effectiveness of new models and validate new ideas that generalize universally to larger and more complex datasets **in the absence of** a standard and widely-adopted **benchmark**.

**To address** this paramount concern existing in graph learning research, we develop an open-source, easy-to-use and reproducible benchmarking framework with a rigorous experimental protocol that is representative of the categorical advances in GNNs.

This post outlines the issues in the GNN literature suggesting the need for a benchmark, the framework proposed in the paper, the broad classes of widely used and powerful GNNs benchmarked, and the insights learnt from the extensive experiments.

In any core research or application area in deep learning, a benchmark helps to identify and quantify what types of architectures, principles, or mechanisms are universal and generalizable to real-world tasks and large datasets. In particular, the recent revolution in this field of AI is often credited, *to a possibly large extent*, to the large-scale benchmark image dataset ImageNet. (Obviously, other driving factors include the increase in the volume of research, more datasets, compute, wide adoption, etc.)

Fig 1: ImageNet Classification Leaderboard from paperswithcode.com

Benchmarking has proved beneficial for **driving progress**, identifying **essential ideas**, and solving domain-related problems in many sub-fields of science. This project was conceived with this fundamental motivation.

Many of the widely cited papers in the GNN literature contain experiments that are evaluated on **small graph datasets** with only a few hundred (or a few thousand) graphs.

Fig 2: Statistics of the widely used TU datasets. Source Errica et al., 2020

**Take, for example**, the ENZYMES dataset, which is seen in almost every work on GNNs for classification tasks. If one uses a random 10-fold cross-validation (as in most papers), the test set has 60 graphs (i.e., 10% of the 600 total graphs). That means a single correct classification (or, alternatively, a misclassification) changes the test accuracy by 1.67%. **A couple of samples could determine a 3.33% difference in the performance measure**, which is usually the size of gain reported as significant when validating a new idea in the literature. The number of samples is simply too small to reliably acknowledge the advances.¹
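The arithmetic behind these percentages is easy to check. The snippet below only restates the numbers from the paragraph above (600 graphs, 10 folds); the variable names are mine.

```python
total_graphs = 600
folds = 10

test_size = total_graphs // folds   # graphs in one test fold
step = 100.0 / test_size            # accuracy change per graph, in percent

print(test_size, round(step, 2), round(2 * step, 2))  # 60 1.67 3.33
```

In other words, test accuracy on one fold can only move in steps of about 1.67 percentage points, so a reported gain of a few points may correspond to just two or three graphs.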

Our experiments, too, show that the standard deviation of performance on such datasets is large, making it difficult to draw substantial conclusions about a research idea. Moreover, most GNNs perform statistically the same on these datasets. The **quality** of these datasets also leads one to question whether they should be used for validating ideas on GNNs. On several of these datasets, simpler models sometimes perform as well as, or even beat, GNNs.

Consequently, **it has become difficult** to differentiate complex, simple and graph-agnostic architectures for graph machine learning.

Several papers in the GNN literature do not agree on a unified and robust experimental setting, which has led to work discussing the inconsistencies and re-evaluating several papers’ experiments.

To highlight a couple of examples, Ying et al., 2018 trained on 10-fold split data for a fixed number of epochs and reported the performance of the epoch with the *“highest average validation accuracy across the splits at any epoch”*, whereas Lee et al., 2019 used an *“early stopping criterion”*, monitoring the epoch-wise validation loss, and reported the *“average test accuracy at last epoch”* over the 10-fold split.
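To see why these two reporting protocols are not directly comparable, here is a rough sketch of both on made-up accuracy curves. The function names, the two-fold setup, and every number are invented for illustration and are not results from either paper.

```python
def mean(xs):
    return sum(xs) / len(xs)

def best_epoch_mean_val(val_acc):
    """Protocol A: highest per-epoch validation accuracy, averaged across folds.

    val_acc[f][e] is the validation accuracy of fold f at epoch e.
    """
    n_epochs = len(val_acc[0])
    return max(mean([fold[e] for fold in val_acc]) for e in range(n_epochs))

def last_epoch_mean_test(test_acc, stop_epoch):
    """Protocol B: test accuracy at each fold's stopping epoch, averaged.

    test_acc[f][e] is the test accuracy of fold f at epoch e;
    stop_epoch[f] is the epoch at which early stopping ended fold f.
    """
    return mean([fold[e] for fold, e in zip(test_acc, stop_epoch)])

# Toy curves for 2 folds and 3 epochs (made-up numbers).
val_acc = [[0.60, 0.70, 0.65], [0.62, 0.74, 0.70]]
test_acc = [[0.58, 0.66, 0.64], [0.60, 0.69, 0.67]]
stop_epoch = [2, 1]

score_a = best_epoch_mean_val(val_acc)
score_b = last_epoch_mean_test(test_acc, stop_epoch)
print(round(score_a, 3), round(score_b, 3))
```

Protocol A reports a validation number selected after looking at all epochs, while protocol B reports a test number at whatever epoch training stopped, so the two scores measure different things even for the same model.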

Now, if we put the results of both these papers together in the same table and claim that the model with the highest performance score is the most promising of all, **can we be convinced** that the comparison is fair?

There are other issues related to hyperparameter selection, comparisons under unfair budgets of trainable parameters, the use of different train-validation-test splits, etc.

The existence of such problems pushed us to develop a GNN benchmarking framework which **standardizes GNN research** and helps researchers make more meaningful advances.

#benchmarking #graph #graph-deep-learning #graph-neural-networks #deep-learning

1604127900

This blog post summarises the paper “Simplifying Graph Convolutional Networks” [1], which tries to reverse engineer Graph Convolutional Networks. So, let us evolve Graph Convolutional Networks backward.

Graphs are pervasive models of structure. They are everywhere, from social networks to chemical molecules, and many things can be represented as graphs. However, applying machine learning to these structures did not come to us directly. Most things in machine learning started from a small, simple idea or model that was made more complex over time as needed. For example, we first had the Perceptron, which evolved into the Multi-Layer Perceptron; similarly, image filters evolved into non-linear CNNs, and so on. Graph Convolutional Networks (GCNs), however, were derived directly from existing ideas and had a more complex start. To demystify GCNs, the paper reverse engineers the GCN and proposes a simplified linear model called **Simple Graph Convolution (SGC).** When applied, SGC gives performance comparable to GCNs and is faster even than Fast-GCN.

Inputs to the Graph convolutional network are:

1. Node Labels

2. Adjacency matrix

**Adjacency matrix:** The adjacency matrix **A** is an **n x n** matrix, where n is the number of nodes, with a(i,j) = 1 if node i is connected to node j and a(i,j) = 0 otherwise. If the edge is weighted, then a(i,j) = edge weight.

**Diagonal Matrix:** The diagonal (degree) matrix **D** is an n x n matrix with d(i,i) = sum of the **i**th row of the adjacency matrix.

**Input features:** X is an input feature matrix of size **n x c**, with c as the number of features per node.
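The definitions of A and D above can be sketched in a few lines of plain Python. The helper names and the edge-triple format are assumptions made for this example, and the graph is taken to be undirected.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of an undirected graph.

    edges is a list of (i, j, weight) triples; use weight 1 for unweighted edges.
    """
    A = [[0] * n for _ in range(n)]
    for i, j, w in edges:
        A[i][j] = w
        A[j][i] = w   # undirected: mirror the entry
    return A

def degree_matrix(A):
    """Diagonal matrix whose entry d(i,i) is the sum of row i of A."""
    n = len(A)
    return [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]

A = adjacency_matrix(3, [(0, 1, 1), (1, 2, 2)])
D = degree_matrix(A)
print(A)  # [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
print(D)  # [[1, 0, 0], [0, 3, 0], [0, 0, 2]]
```

For large graphs these matrices are mostly zeros, so practical implementations usually store them in a sparse format instead of nested lists.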

Let us see how GCNs actually work before reverse engineering them.

#machine-learning #graph-convolution #graph-neural-networks #gcn #neural-networks