1594478340
A Graph Neural Network (GNN) is a type of neural network that can be applied directly to graph-structured data. My previous post gave a brief introduction to GNNs; readers may refer to that post for more details.
Many research works have shown the power of GNNs for understanding graphs, but how and why GNNs work still remains a mystery to most people. Unlike CNNs, where we can extract the activations of each layer to visualize the decisions of the network, with GNNs it is hard to get a meaningful explanation of what features the network has learnt. Why does a GNN decide that a node belongs to class A instead of class B? Why does a GNN decide that a graph is a certain kind of chemical compound or molecule? It seems like the GNN picks up some useful structural information and makes its decisions based on these observations. But then the question is: what observations does the GNN actually see?
GNNExplainer is introduced in this paper.
Briefly speaking, it tries to build a model that learns what a GNN has learnt.
The main principle of GNNExplainer is to remove redundant information in a graph that does not directly impact the decision. To explain a graph, we want to know which crucial features or structures in the graph affect the decisions of the neural network. If a feature is important, then the prediction should change considerably when this feature is removed or replaced with something else. On the other hand, if removing or altering a feature does not affect the prediction outcome, that feature is considered non-essential and thus should not be included in the explanation of the graph.
The primary objective of GNNExplainer is to generate a minimal graph that explains the decision for a node or a graph. To achieve this goal, the problem can be defined as finding a subgraph of the computation graph that minimizes the difference between the prediction scores obtained with the whole computation graph and with the minimal graph. In the paper, this is formulated as maximizing the mutual information (MI) between the minimal graph Gs and the computation graph G:
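In the paper's notation, where Y denotes the GNN's predicted label distribution and X_S the features of the nodes in the subgraph G_S, this objective reads as

\max_{G_S} \; MI\big(Y, (G_S, X_S)\big) \;=\; H(Y) \;-\; H\big(Y \mid G = G_S,\, X = X_S\big)

Since the entropy H(Y) of the trained GNN's prediction is fixed, maximizing the mutual information amounts to minimizing the conditional entropy, i.e., keeping the subgraph that best preserves the original prediction.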
Besides this, there is a secondary objective: the explanation graph needs to be minimal. Although it was mentioned in the first objective, we still need a way to formulate it explicitly. The paper addresses it by adding a loss on the number of edges. Therefore, the loss of GNNExplainer is simply the combination of the prediction loss and the edge-size loss.
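As a rough illustration (not the authors' code; the function and argument names below are made up), the combined objective can be sketched in PyTorch as a prediction term plus an edge-size penalty on a learnable edge mask:

```python
import torch

def gnn_explainer_loss(masked_logits, original_pred_class, edge_mask, size_coef=0.005):
    """Hypothetical sketch of the combined GNNExplainer loss.

    masked_logits:       GNN output when the graph is weighted by the (sigmoid of the) edge mask
    original_pred_class: class originally predicted by the GNN on the full computation graph
    edge_mask:           learnable tensor with one entry per edge of the computation graph
    """
    # Prediction loss: keep the masked subgraph's prediction close to the original prediction
    log_probs = torch.log_softmax(masked_logits, dim=-1)
    prediction_loss = -log_probs[original_pred_class]

    # Edge-size loss: penalize the total mask weight so the explanation stays small
    mask = torch.sigmoid(edge_mask)
    size_loss = size_coef * mask.sum()

    return prediction_loss + size_loss
```

In the actual paper, this size term is accompanied by an entropy regularizer on the mask and by an analogous mask over node features; both are omitted here for brevity.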
#graph #graph-neural-networks #graph-theory #pattern-recognition #machine-learning
1595691960
In this post, we're going to take a close look at one of the best-known graph neural networks, the GCN. First, we'll get the intuition for how it works, then we'll go deeper into the maths behind it.
Many problems are graphs by nature. In our world, a lot of data comes as graphs, such as molecules, social networks, and paper citation networks.
Examples of graphs. (Picture from [1])
In a graph, we have node features (the data associated with each node) and the structure of the graph (how the nodes are connected).
For the former, we can easily get the data from each node. But when it comes to the structure, it is not trivial to extract useful information from it. For example, if two nodes are close to one another, should we treat them differently from other pairs? What about high- and low-degree nodes? In fact, each specific task can consume a lot of time and effort just on feature engineering, i.e., on distilling the structure into usable features.
Feature engineering on graphs. (Picture from [1])
It would be much better to somehow feed both the node features and the structure into the model as input, and let the machine figure out what information is useful by itself.
That’s why we need Graph Representation Learning.
We want the model to learn the “feature engineering” on the graph by itself. (Picture from [1])
Paper: Semi-Supervised Classification with Graph Convolutional Networks (2017) [3]
GCN is a type of convolutional neural network that can work directly on graphs and take advantage of their structural information.
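Concretely, the layer-wise propagation rule introduced in that paper can be written as

H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\, H^{(l)}\, W^{(l)}\right)

where Ã is the adjacency matrix with added self-loops, D̃ its degree matrix, H^(l) the node representations at layer l, W^(l) a learnable weight matrix, and σ a non-linearity.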
It solves the problem of classifying nodes (such as documents) in a graph (such as a citation network) where labels are only available for a small subset of nodes (semi-supervised learning).
Example of semi-supervised learning on graphs. Some nodes don't have labels (unknown nodes).
#graph-neural-networks #graph-convolution-network #deep-learning #neural-networks
1602838800
This post will summarize the paper SimGNN, which aims for fast graph similarity computation. Graphs are structures used to link different entities, which we call nodes, via relationships called edges. Graphs exist everywhere, from the bonds between atoms to friendships on Facebook; all these scenarios can be represented as graphs. One of the fundamental graph problems is computing the similarity between graphs. The similarity between graphs can be defined using metrics such as the Graph Edit Distance (GED).
However, the currently available algorithms for computing these metrics have high complexity, and it is not yet feasible to compute the exact GED for graphs with more than 16 nodes.
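As a small illustration (using NetworkX's built-in exact GED routine, not anything from SimGNN), even a brute-force computation on tiny graphs shows what this metric measures, and why it does not scale:

```python
import networkx as nx

# Two small example graphs: a triangle and a path on the same number of nodes
g1 = nx.cycle_graph(3)   # 3 nodes, 3 edges
g2 = nx.path_graph(3)    # 3 nodes, 2 edges

# Exact graph edit distance: the minimum number of node/edge edits
# (insertions, deletions, substitutions) needed to turn g1 into g2.
# The underlying search is exponential, which is why exact GED is only
# practical for graphs with a handful of nodes.
print(nx.graph_edit_distance(g1, g2))  # -> 1.0 (delete one edge)
```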
Some ways to compute these metrics are:
SimGNN follows another approach to tackle this problem, i.e., it turns the similarity computation problem into a learning problem.
Before getting into how SimGNN works, we must know the requirements this model should satisfy. These include:
**SimGNN Approach:** To satisfy the above-stated requirements, SimGNN uses two strategies:
#graph-edit-distance #machine-learning #graph-neural-networks #graph-convolution-network
1594474140
Graph Neural Networks (GNNs) are widely used today in diverse applications spanning the social sciences, knowledge graphs, chemistry, physics, neuroscience, etc., and accordingly there has been a great surge of interest in them and growth in the number of papers in the literature.
However, in the absence of a standard and widely-adopted benchmark, it has become increasingly difficult to gauge the effectiveness of new models and to validate new ideas that generalize to larger and more complex datasets.
To address this paramount concern in graph learning research, we develop an open-source, easy-to-use and reproducible benchmarking framework with a rigorous experimental protocol that is representative of the categorical advances in GNNs.
This post outlines the issues in the GNN literature that suggest the need for a benchmark, the framework proposed in the paper, the broad classes of widely used and powerful GNNs that are benchmarked, and the insights learnt from the extensive experiments.
In any core research or application area in deep learning, a benchmark helps to identify and quantify what types of architectures, principles, or mechanisms are universal and generalizable to real-world tasks and large datasets. In particular, the recent revolution in this field of AI is often credited, to a possibly large extent, to having been triggered by the large-scale benchmark image dataset ImageNet. (Obviously, other driving factors include the increase in the volume of research, more datasets, more compute, wide adoption, etc.)
Fig 1: ImageNet Classification Leaderboard from paperswithcode.com
Benchmarking has proved to be beneficial for driving progress, identifying essential ideas, and solving domain-related problems in many sub-fields of science. This project was conceived with this fundamental motivation.
Many of the widely cited papers in the GNN literature contain experiments that are evaluated on small graph datasets which have only a few hundred (or a few thousand) graphs.
Fig 2: Statistics of the widely used TU datasets. Source: Errica et al., 2020
Take, for example, the ENZYMES dataset, which appears in almost every work on GNNs for graph classification. If one uses a random 10-fold cross validation (as in most papers), the test set has 60 graphs (i.e. 10% of the 600 total graphs). That means a single correct classification (or, alternatively, a misclassification) changes the test accuracy by 1.67%. A couple of samples can make a 3.33% difference in the performance measure, which is often the entire gain claimed when a new idea is validated in the literature. Clearly, this number of samples is too small to reliably establish an advance.¹
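Spelling out the arithmetic behind those percentages:

\text{test set} = 600 \times 10\% = 60 \text{ graphs}, \qquad \frac{1}{60} \approx 1.67\%, \qquad \frac{2}{60} \approx 3.33\%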
Our experiments, too, show that the standard deviation of performance on such datasets is large, making it difficult to draw substantial conclusions about a research idea. Moreover, most GNNs perform statistically the same on these datasets. The quality of these datasets also leads one to question whether they should be used at all when validating ideas on GNNs. On several of these datasets, simpler models sometimes perform just as well as, or even beat, GNNs.
Consequently, it has become difficult to differentiate complex, simple and graph-agnostic architectures for graph machine learning.
Several papers in the GNN literature do not agree on a unifying and robust experimental setting, which has led to discussions of the inconsistencies and to the re-evaluation of several papers' experiments.
For a couple of examples to highlight here, Ying et al., 2018 performed training on 10-fold split data for a fixed number of epochs and reported the performance of the epoch which has the “highest average validation accuracy across the splits at any epoch” whereas Lee et al., 2019 used an “early stopping criterion” by monitoring the epoch-wise validation loss and report “average test accuracy at last epoch” over 10-fold split.
Now, if we extract the results of both these papers, put them together in the same table, and claim that the model with the highest performance score is the most promising of all, can we be convinced that the comparison is fair?
There are other issues as well, related to hyperparameter selection, comparisons under unequal budgets of trainable parameters, the use of different train-validation-test splits, etc.
The existence of such problems pushed us to develop a GNN benchmarking framework that standardizes GNN research and helps researchers make more meaningful advances.
#benchmarking #graph #graph-deep-learning #graph-neural-networks #deep-learning
1604127900
This blog post will summarise the paper “Simplifying Graph Convolutional Networks” [1], which tries to reverse engineer Graph Convolutional Networks. So, let us evolve Graph Convolutional Networks backward.
Graphs are pervasive models of structure. They are everywhere, from social networks to chemical molecules, and a great variety of things can be represented as graphs. However, applying machine learning to these structures did not come to us directly. Almost everything in machine learning started from a small, simple idea or model that was made more complex over time as the need arose: the Perceptron, for instance, evolved into the Multi-Layer Perceptron, and simple image filters evolved into non-linear CNNs. Graph Convolutional Networks, referred to as GCNs, were instead derived directly from existing ideas and had a more complex starting point. To demystify GCNs, the paper reverse engineers them and proposes a simplified linear model called Simple Graph Convolution (SGC). SGC achieves performance comparable to GCNs and is faster even than FastGCN.
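For orientation, the linear model that the paper arrives at can be written roughly as

S = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}, \qquad \hat{Y}_{\mathrm{SGC}} = \mathrm{softmax}\!\left( S^{K} X\, \Theta \right)

where Ã is the adjacency matrix with self-loops, D̃ its degree matrix, X the node feature matrix, Θ a learned weight matrix, and K the number of propagation steps; the non-linearities between GCN layers have been removed.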
Inputs to the Graph convolutional network are:
1. Node Labels
2. Adjacency matrix
**Adjacency matrix:** The adjacency matrix **A** is an n x n matrix, where n is the number of nodes, with a(i,j) = 1 if node i is connected to node j and a(i,j) = 0 otherwise. If the edge is weighted, then a(i,j) = the edge weight.
**Diagonal matrix:** The diagonal matrix **D** is an n x n matrix with d(i,i) = the sum of the i-th row of the adjacency matrix, i.e., the degree of node i.
**Input features:** X is an input feature matrix of size n x c, where c is the number of features per node.
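To make these definitions concrete, here is a small NumPy sketch (the toy graph, feature size, and variable names are made up for illustration) that builds A and D, forms the symmetrically normalized adjacency used by GCN/SGC, and applies one propagation step to X:

```python
import numpy as np

# Toy graph: 4 nodes, edges (0-1), (1-2), (2-3)
n = 4
edges = [(0, 1), (1, 2), (2, 3)]

# Adjacency matrix A: a(i, j) = 1 if node i is connected to node j
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Add self-loops (A_tilde = A + I), as GCN and SGC do
A_tilde = A + np.eye(n)

# Diagonal degree matrix D: d(i, i) = sum of the i-th row of A_tilde
D_tilde = np.diag(A_tilde.sum(axis=1))

# Symmetrically normalized adjacency S = D^{-1/2} * A_tilde * D^{-1/2}
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D_tilde)))
S = D_inv_sqrt @ A_tilde @ D_inv_sqrt

# Input feature matrix X (n x c), here with c = 3 random features per node
X = np.random.rand(n, 3)

# One propagation step: each node's features become a normalized
# average of its own and its neighbours' features
X_smoothed = S @ X
print(X_smoothed.shape)  # (4, 3)
```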
Let us see how GCNs actually work before reverse engineering them.
#machine-learning #graph-convolution #graph-neural-networks #gcn #neural-networks