A Ribeiro | TU Delft Repository

EdgeNets

Edge Varying Graph Neural Networks

Journal article (2022) - Elvin Isufi, Fernando Gama, Alejandro Ribeiro

Driven by the outstanding performance of neural networks in the structured Euclidean domain, recent years have seen a surge of interest in developing neural networks for graphs and data supported on graphs. The graph is leveraged at each layer of the neural network as a parameterization to capture detail at the node level with a reduced number of parameters and computational complexity. Following this rationale, this paper puts forth a general framework that unifies state-of-the-art graph neural networks (GNNs) through the concept of EdgeNet. An EdgeNet is a GNN architecture that allows different nodes to use different parameters to weigh the information of different neighbors. By extrapolating this strategy to more iterations between neighboring nodes, the EdgeNet learns edge- and neighbor-dependent weights to capture local detail. This is a general linear and local operation that a node can perform and encompasses under one formulation all existing graph convolutional neural networks (GCNNs) as well as graph attention networks (GATs). In writing different GNN architectures with a common language, EdgeNets highlight specific architecture advantages and limitations, while providing guidelines to improve their capacity without compromising their local implementation. For instance, we show that GCNNs have a parameter sharing structure that induces permutation equivariance. This can be an advantage or a limitation, depending on the application. In cases where it is a limitation, we propose hybrid approaches and provide insights to develop several other solutions that promote parameter sharing without enforcing permutation equivariance. Another interesting conclusion is the unification of GCNNs and GATs - approaches that have been so far perceived as separate. In particular, we show that GATs are GCNNs on a graph that is learned from the features. This particularization opens the doors to develop alternative attention mechanisms for improving discriminatory power.
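The contrast between shared and edge-dependent weights in this abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the 4-node path graph, the signal, the coefficients, and the function names are assumptions made for illustration. The sketch applies one filter with a single scalar per hop (a graph convolution) and one with a separate weight matrix per hop restricted to the graph's edges and self-loops, then checks that tying the edge weights to one scalar per hop recovers the convolution.

```python
import numpy as np

# Hypothetical 4-node path graph; S is its adjacency matrix (the shift operator).
S = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
mask = (S + np.eye(4)) != 0          # allowed entries: edges plus self-loops
x = np.array([1.0, -2.0, 0.5, 3.0])  # one scalar feature per node


def graph_convolution(S, x, h):
    """y = sum_k h[k] S^k x: one scalar per hop, shared by every node and edge."""
    y, z = np.zeros_like(x), x.copy()
    for hk in h:
        y += hk * z
        z = S @ z
    return y


def edge_varying_filter(mask, x, Phis):
    """y = sum_k Phi_k ... Phi_0 x, with every Phi_k supported on `mask`."""
    y, z = np.zeros_like(x), x.copy()
    for Phi in Phis:
        z = np.where(mask, Phi, 0.0) @ z   # local exchange with edge-specific weights
        y += z
    return y


h = [0.5, -1.0, 0.25]
# Tying all edge weights of a hop to one scalar recovers the convolution.
Phis_shared = [h[0] * np.eye(4), (h[1] / h[0]) * S, (h[2] / h[1]) * S]
y_conv = graph_convolution(S, x, h)
y_edge = edge_varying_filter(mask, x, Phis_shared)
```

In this sketch the convolution has three trainable scalars, while the edge-varying filter may use one free weight per allowed entry of every `Phi_k`, which is the extra capacity the abstract refers to.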

Variance-Constrained Learning for Stochastic Graph Neural Networks

Conference paper (2021) - Zhan Gao, Elvin Isufi, Alejandro Ribeiro

Stochastic graph neural networks (SGNNs) are information processing architectures that can learn representations from data over random graphs. SGNNs are trained with respect to the expected performance, but this training comes with no guarantee about the deviation of particular output realizations around the optimal mean. To overcome this issue, we propose a learning strategy for SGNNs based on a variance constrained optimization problem, balancing the expected performance and the stochastic deviation. To handle the variance constraint in the stochastic optimization problem, training is undertaken in the dual domain. We propose an alternating primal-dual learning algorithm that updates the primal variable (SGNN parameters) with gradient descent and the dual variable with gradient ascent. We show the stochastic deviation is explicitly controlled through Chebyshev inequality and analyze the optimality loss induced by the primal-dual learning. Through numerical simulations, we observe a strong performance in expectation with a controllable deviation corroborating the theoretical findings. ...
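The alternating primal-dual update described above can be sketched on a scalar toy problem. This is only an illustration of the mechanism, not the SGNN algorithm from the paper: the model `y = w * xi`, the target, the variance budget, and the step sizes are all assumptions, and the expectation and variance are available in closed form here, whereas in practice they would be estimated from samples.

```python
# Toy model (assumed): output y = w * xi, where xi has mean 1 and variance 1.
TARGET = 2.0
VAR_XI = 1.0
VAR_BUDGET = 0.25   # constraint: Var[y] = w^2 * VAR_XI <= VAR_BUDGET


def expected_loss_grad(w):
    # d/dw E[(w*xi - TARGET)^2] = 2*w*E[xi^2] - 2*TARGET*E[xi], with E[xi^2] = 1 + VAR_XI
    return 2.0 * w * (1.0 + VAR_XI) - 2.0 * TARGET


def variance(w):
    return w * w * VAR_XI


def primal_dual(steps=2000, lr_w=0.05, lr_lam=0.05):
    w, lam = 0.0, 0.0
    for _ in range(steps):
        # Primal step: gradient descent on the Lagrangian with respect to w.
        grad_w = expected_loss_grad(w) + lam * 2.0 * w * VAR_XI
        w -= lr_w * grad_w
        # Dual step: projected gradient ascent on the constraint violation.
        lam = max(0.0, lam + lr_lam * (variance(w) - VAR_BUDGET))
    return w, lam


w_star, lam_star = primal_dual()
```

For this toy problem the unconstrained minimizer is `w = 1`, which violates the variance budget, so the dual variable grows until the iterate settles on the constraint boundary at `w = 0.5`.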

The Dual Graph Shift Operator

Identifying the Support of the Frequency Domain

Journal article (2021) - Geert Leus, Santiago Segarra, Alejandro Ribeiro, Antonio G. Marques

Contemporary data is often supported by an irregular structure, which can be conveniently captured by a graph. Accounting for this graph support is crucial to analyze the data, leading to an area known as graph signal processing (GSP). The two most important tools in GSP are the graph shift operator (GSO), which is a sparse matrix accounting for the topology of the graph, and the graph Fourier transform (GFT), which maps graph signals into a frequency domain spanned by a number of graph-related Fourier-like basis vectors. This alternative representation of a graph signal is denominated the graph frequency signal. Several attempts have been undertaken in order to interpret the support of this graph frequency signal, but they all resulted in a one-dimensional interpretation. However, if the support of the original signal is captured by a graph, why would the graph frequency signal have a simple one-dimensional support? Departing from existing work, we propose an irregular support for the graph frequency signal, which we coin dual graph. A dual GSO leads to a better interpretation of the graph frequency signal and its domain, helps to understand how the different graph frequencies are related and clustered, enables the development of better graph filters and filter banks, and facilitates the generalization of classical SP results to the graph domain. ...
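As a minimal sketch of the two tools named in this abstract, the snippet below builds a GSO and computes a GFT with NumPy. The 5-node cycle graph and the test signal are assumptions, and the adjacency matrix is used as the GSO; the dual GSO proposed in the paper is not constructed here.

```python
import numpy as np

# Hypothetical undirected 5-node cycle; its adjacency matrix serves as the GSO.
N = 5
S = np.zeros((N, N))
for n in range(N):
    S[n, (n + 1) % N] = S[(n + 1) % N, n] = 1.0

# For a symmetric GSO, S = V diag(lam) V^T with orthonormal V.
lam, V = np.linalg.eigh(S)

x = np.arange(N, dtype=float)   # an arbitrary graph signal
x_freq = V.T @ x                # GFT: the graph frequency signal
x_back = V @ x_freq             # inverse GFT recovers the original signal
```

Each entry of `x_freq` is attached to one eigenvalue in `lam`; the question raised in the abstract is what structure, beyond a one-dimensional ordering of those eigenvalues, should support that vector.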

Stability of graph convolutional neural networks to stochastic perturbations

Journal article (2021) - Zhan Gao, Elvin Isufi, Alejandro Ribeiro

Graph convolutional neural networks (GCNNs) are nonlinear processing tools to learn representations from network data. A key property of GCNNs is their stability to graph perturbations. Current analysis considers deterministic perturbations but fails to provide relevant insights when topological changes are random. This paper investigates the stability of GCNNs to stochastic graph perturbations induced by link losses. In particular, it proves the expected output difference between the GCNN over random perturbed graphs and the GCNN over the nominal graph is upper bounded by a factor that is linear in the link loss probability. We perform the stability analysis in the graph spectral domain such that the result holds uniformly for any graph. This result also shows the role of the nonlinearity and the architecture width and depth, and allows identifying handles to improve the GCNN robustness. Numerical simulations on source localization and robot swarm control corroborate our theoretical findings. ...

Stochastic graph neural networks

Journal article (2021) - Zhan Gao, Elvin Isufi, Alejandro Ribeiro

Graph neural networks (GNNs) model nonlinear representations in graph data with applications in distributed agent coordination, control, and planning among others. Current GNN architectures assume ideal scenarios and ignore link fluctuations that occur due to environment, human factors, or external attacks. In these situations, the GNN fails to address its distributed task if the topological randomness is not considered accordingly. To overcome this issue, we put forth the stochastic graph neural network (SGNN) model: a GNN where the distributed graph convolution module accounts for the random network changes. Since stochasticity brings in a new learning paradigm, we conduct a statistical analysis on the SGNN output variance to identify conditions the learned filters should satisfy for achieving robust transference to perturbed scenarios, ultimately revealing the explicit impact of random link losses. We further develop a stochastic gradient descent (SGD) based learning process for the SGNN and derive conditions on the learning rate under which this learning process converges to a stationary point. Numerical results corroborate our theoretical findings and compare the benefits of SGNN robust transference with a conventional GNN that ignores graph perturbations during learning. ...
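A graph convolution that accounts for random link losses can be sketched as follows. This is an illustrative reading of the abstract rather than the paper's code: the function names, the independent per-edge loss model with probability `p`, and the example graph are assumptions.

```python
import numpy as np


def sample_graph(S, p, rng):
    """Drop each undirected edge of S independently with probability p."""
    keep = np.triu(rng.random(S.shape) >= p, k=1)
    keep = keep | keep.T
    return S * keep


def stochastic_graph_convolution(S, x, h, p, rng):
    """y = sum_k h[k] S_k ... S_1 x, drawing a fresh random graph at every shift."""
    y, z = h[0] * x, x.copy()
    for hk in h[1:]:
        z = sample_graph(S, p, rng) @ z
        y = y + hk * z
    return y


rng = np.random.default_rng(0)
S = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # hypothetical 4-node path graph
x = np.array([1.0, 0.0, -1.0, 2.0])
h = [1.0, 0.5, 0.25]

y_nominal = stochastic_graph_convolution(S, x, h, 0.0, rng)   # no link losses
y_random = stochastic_graph_convolution(S, x, h, 0.3, rng)    # one random realization
y_isolated = stochastic_graph_convolution(S, x, h, 1.0, rng)  # every link lost
```

With `p = 0` the operation reduces to the usual deterministic graph convolution, and with `p = 1` only the zero-hop term survives; intermediate values yield a random output whose mean and variance are the objects analyzed in the paper.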

Nonlinear State-Space Generalizations of Graph Convolutional Neural Networks

Conference paper (2021) - Luana Ruiz, Fernando Gama, Alejandro Ribeiro, Elvin Isufi

Graph convolutional neural networks (GCNNs) learn compositional representations from network data by nesting linear graph convolutions into nonlinearities. In this work, we approach GCNNs from a state-space perspective revealing that the graph convolutional module is a minimalistic linear state-space model, in which the state update matrix is the graph shift operator. We show that this state update may be problematic because it is nonparametric, and depending on the graph spectrum it may explode or vanish. Therefore, the GCNN has to trade its degrees of freedom between extracting features from data and handling these instabilities. To improve such trade-off, we propose a novel family of nodal aggregation rules that aggregate node features within a layer in a nonlinear state-space parametric fashion allowing for a better trade-off. We develop two architectures within this family inspired by the recurrence with and without nodal gating mechanisms. The proposed solutions generalize the GCNN and provide an additional handle to control the state update and learn from the data. Numerical results on source localization and authorship attribution show the superiority of the nonlinear state-space generalization models over the baseline GCNN. ...
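The state-space reading of the graph convolution, and the way the state can explode or vanish with the graph spectrum, can be sketched numerically. The path graph, the all-ones input, and the scaling factors below are assumptions chosen for illustration; the nonlinear aggregation rules proposed in the paper are not implemented.

```python
import numpy as np

# Hypothetical 4-node path graph used as the graph shift operator.
S = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rho = np.max(np.abs(np.linalg.eigvalsh(S)))   # spectral radius of S
x0 = np.ones(4)


def state_norms(S, x0, K):
    """Norms of the states x_k = S x_{k-1} for k = 0..K (linear state update)."""
    norms, x = [np.linalg.norm(x0)], x0.copy()
    for _ in range(K):
        x = S @ x
        norms.append(np.linalg.norm(x))
    return norms


vanishing = state_norms(0.5 * S / rho, x0, K=10)   # spectral radius 0.5
exploding = state_norms(2.0 * S / rho, x0, K=10)   # spectral radius 2.0
```

Because the state update matrix is fixed by the graph, the only way for the filter coefficients to compensate is to scale each `x_k` after the fact, which is the trade-off the abstract describes.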

Stochastic Graph Neural Networks

Conference paper (2020) - Zhan Gao, Elvin Isufi, Alejandro Ribeiro

Graph neural networks (GNNs) model nonlinear representations in graph data with applications in distributed agent coordination, control, and planning among others. However, current GNN implementations assume ideal distributed scenarios and ignore link fluctuations that occur due to environment or human factors. In these situations, the GNN fails to address its distributed task if the topological randomness is not considered accordingly. To overcome this issue, we put forth the stochastic graph neural network (SGNN) model: a GNN where the distributed graph convolutional operator is modified to account for the network changes. Since stochasticity brings in a new paradigm, we develop a novel learning process for the SGNN and introduce the stochastic gradient descent (SGD) algorithm to estimate the parameters. We prove through the SGD that the SGNN learning process converges to a stationary point under mild Lipschitz assumptions. Numerical simulations corroborate the proposed theory and show an improved performance of the SGNN compared with the conventional GNN when operating over random time varying graphs. ...

Graph-Adaptive Activation Functions for Graph Neural Networks

Conference paper (2020) - Bianca Iancu, Luana Ruiz, Alejandro Ribeiro, Elvin Isufi

Activation functions are crucial in graph neural networks (GNNs) as they allow defining a nonlinear family of functions to capture the relationship between the input graph data and their representations. This paper proposes activation functions for GNNs that not only incorporate the graph into the nonlinearity, but are also distributable. To incorporate the feature-topology coupling into all GNN components, nodal features are nonlinearized and combined with a set of trainable parameters in a form akin to graph convolutions. The latter leads to a graph-adaptive trainable nonlinear component of the GNN that can be implemented directly or via kernel transformations, therefore, enriching the class of functions to represent the network data. Whether in the direct or kernel form, we show permutation equivariance is always preserved. We also prove the subclass of graph-adaptive max activation functions is Lipschitz stable to input perturbations. Numerical experiments with distributed source localization, finite-time consensus, distributed regression, and recommender systems corroborate our findings and show improved performance compared with pointwise as well as state-of-the-art localized nonlinearities. ...

Graphs, Convolutions, and Neural Networks

From Graph Filters to Graph Neural Networks

Review (2020) - Fernando Gama, Elvin Isufi, Geert Leus, Alejandro Ribeiro

Network data can be conveniently modeled as a graph signal, where data values are assigned to nodes of a graph that describes the underlying network topology. Successful learning from network data is built upon methods that effectively exploit this graph structure. In this article, we leverage graph signal processing (GSP) to characterize the representation space of graph neural networks (GNNs). We discuss the role of graph convolutional filters in GNNs and show that any architecture built with such filters has the fundamental properties of permutation equivariance and stability to changes in the topology. These two properties offer insight about the workings of GNNs and help explain their scalability and transferability properties, which, coupled with their local and distributed nature, make GNNs powerful tools for learning in physical networks. We also introduce GNN extensions using edge-varying and autoregressive moving average (ARMA) graph filters and discuss their properties. Finally, we study the use of GNNs in recommender systems and learning decentralized controllers for robot swarms. ...
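Permutation equivariance of graph convolutional filters can be checked numerically with a short sketch. The random graph, the signal, and the filter taps are assumptions; the check only covers the linear filter, not a full GNN with nonlinearities.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Hypothetical random undirected graph and signal.
A = np.triu(rng.random((N, N)) < 0.5, k=1).astype(float)
S = A + A.T
x = rng.standard_normal(N)
h = [1.0, 0.5, -0.25]


def graph_filter(S, x, h):
    """Graph convolutional filter y = sum_k h[k] S^k x."""
    y, z = np.zeros_like(x), x.copy()
    for hk in h:
        y += hk * z
        z = S @ z
    return y


# Relabel the nodes with a random permutation matrix P.
P = np.eye(N)[rng.permutation(N)]
y_relabel_then_filter = graph_filter(P @ S @ P.T, P @ x, h)
y_filter_then_relabel = P @ graph_filter(S, x, h)
```

The two outputs coincide because `(P S P^T)^k P x = P S^k x` for any permutation matrix `P`, so the same coefficients `h` work regardless of how the nodes are labeled.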

Convolutional Neural Network Architectures for Signals Supported on Graphs

Journal article (2019) - Fernando Gama, Antonio G. Marques, Geert Leus, Alejandro Ribeiro

Two architectures that generalize convolutional neural networks (CNNs) for the processing of signals supported on graphs are introduced. We start with the selection graph neural network (GNN), which replaces linear time invariant filters with linear shift invariant graph filters to generate convolutional features and reinterprets pooling as a possibly nonlinear subsampling stage where nearby nodes pool their information in a set of preselected sample nodes. A key component of the architecture is to remember the position of sampled nodes to permit computation of convolutional features at deeper layers. The second architecture, dubbed aggregation GNN, diffuses the signal through the graph and stores the sequence of diffused components observed by a designated node. This procedure effectively aggregates all components into a stream of information having temporal structure to which the convolution and pooling stages of regular CNNs can be applied. A multinode version of aggregation GNNs is further introduced for operation in large-scale graphs. An important property of selection and aggregation GNNs is that they reduce to conventional CNNs when particularized to time signals reinterpreted as graph signals in a circulant graph. Comparative numerical analyses are performed in a source localization application over synthetic and real-world networks. Performance is also evaluated for an authorship attribution problem and text category classification. Multinode aggregation GNNs are consistently the best-performing GNN architecture. ...
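The aggregation step and its reduction to a time sequence on a circulant graph can be sketched as follows. The directed 8-node cycle, the designated node, the number of diffusion steps, and the filter taps are assumptions for illustration; pooling and the multinode variant are omitted.

```python
import numpy as np

# Directed cycle on N nodes: (S x)[n] = x[n - 1], i.e. a one-sample time delay.
N = 8
S = np.zeros((N, N))
for n in range(N):
    S[n, (n - 1) % N] = 1.0


def aggregate_at_node(S, x, node, K):
    """Sequence [x]_node, [S x]_node, ..., [S^(K-1) x]_node seen by one node."""
    z, seq = x.copy(), []
    for _ in range(K):
        seq.append(z[node])
        z = S @ z
    return np.array(seq)


x = np.arange(N, dtype=float)   # a time signal reinterpreted as a graph signal
node = 5
seq = aggregate_at_node(S, x, node, K=4)

# The aggregated sequence is regular, so an ordinary 1-D convolution applies.
taps = np.array([0.5, 0.25])
features = np.convolve(seq, taps, mode="valid")
```

On this cycle the sequence collected at node 5 consists of the past samples `x[5], x[4], x[3], x[2]`, which is why the architecture reduces to a conventional CNN for time signals; on a general graph the same code collects successively diffused versions of the signal instead.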

Controllability of bandlimited graph processes over random time varying graphs

Journal article (2019) - Fernando Gama, Elvin Isufi, Alejandro Ribeiro, Geert Leus

Controllability of complex networks arises in many technological problems involving social, financial, road, communication, and smart grid networks. In many practical situations, the underlying topology might change randomly with time, due to link failures such as changing friendships, road blocks or sensor malfunctions. Thus, it leads to poorly controlled dynamics if randomness is not properly accounted for. We consider the problem of controlling the network state when the topology varies randomly with time. Our problem concerns target states that are bandlimited over the graph; these are states that have nonzero frequency content only on a specific graph frequency band. We thus leverage graph signal processing and exploit the bandlimited model to drive the network state from a fixed set of control nodes. When controlling the state from a few nodes, we observe that spurious, out-of-band frequency content is created. Therefore, we focus on controlling the network state over the desired frequency band, and then use a graph filter to get rid of the unwanted frequency content. To account for the topological randomness, we develop the concept of controllability in the mean, which consists of driving the expected network state towards the target state. A detailed mean squared error analysis is performed to quantify the statistical deviation between the final controlled state on a particular graph realization and the actual target state. Finally, we propose different control strategies and evaluate their effectiveness on synthetic network models and social networks. ...

Aggregation Graph Neural Networks

Conference paper (2019) - Fernando Gama, Antonio G. Marques, Alejandro Ribeiro, Geert Leus

Graph neural networks (GNNs) regularize classical neural networks by exploiting the underlying irregular structure supporting graph data, extending its application to broader data domains. The aggregation GNN presented here is a novel GNN that exploits the fact that the data collected at a single node by means of successive local exchanges with neighbors exhibits a regular structure. Thus, regular convolution and regular pooling yield an appropriately regularized GNN. To address some scalability issues that arise when collecting all the information at a single node, we propose a multi-node aggregation GNN that constructs regional features that are later aggregated into more global features and so on. We show superior performance in a source localization problem on synthetic graphs and on the authorship attribution problem. ...

Convolutional Graph Neural Networks

Conference paper (2019) - Fernando Gama, Antonio G. Marques, Geert Leus, Alejandro Ribeiro

Convolutional neural networks (CNNs) restrict the, otherwise arbitrary, linear operation of neural networks to be a convolution with a bank of learned filters. This makes them suitable for learning tasks based on data that exhibit the regular structure of time signals and images. The use of convolutions, however, makes them unsuitable for processing data that do not exhibit such a regular structure. Graph signal processing (GSP) has emerged as a powerful alternative to process signals whose irregular structure can be described by a graph. Central to GSP is the notion of graph convolutional filters which can be used to define convolutional graph neural networks (GNNs). In this paper, we show that the graph convolution can be interpreted as either a diffusion or aggregation operation. When combined with nonlinear processing, these different interpretations lead to different generalizations which we term selection and aggregation GNNs. The selection GNN relies on linear combinations of signal diffusions at different resolutions combined with node-wise non-linearities. The aggregation GNN relies on linear combinations of neighborhood averages of different depth. Instead of node-wise nonlinearities, the nonlinearity in aggregation GNNs is pointwise on the different aggregation levels. Both of these models particularize to regular CNNs when applied to time signals but are different when applied to arbitrary graphs. Numerical evaluations show different levels of performance for selection and aggregation GNNs. ...

CNN architectures for GRAPH data

Conference paper (2019) - Fernando Gama, Antonio G. Marques, Geert Leus, Alejandro Ribeiro

In this ongoing work, we describe several architectures that generalize convolutional neural networks (CNNs) to process signals supported on graphs. The general idea is to replace time-invariant filters with graph filters to generate convolutional features and to replace pooling with sampling schemes for graph signals. The different architectures are compared and the key trade-offs are identified. Numerical simulations with both synthetic and real-world data are used to illustrate the advantages of the proposed approaches. ...

MIMO Graph Filters for Convolutional Neural Networks

Conference paper (2018) - Fernando Gama, Antonio G. Marques, Alejandro Ribeiro, Geert Leus

Superior performance and ease of implementation have fostered the adoption of Convolutional Neural Networks (CNNs) for a wide array of inference and reconstruction tasks. CNNs implement three basic blocks: convolution, pooling and pointwise nonlinearity. Since the first two operations are well-defined only on regular-structured data such as audio or images, application of CNNs to contemporary datasets where the information is defined in irregular domains is challenging. This paper investigates CNN architectures to operate on signals whose support can be modeled using a graph. Architectures that replace the regular convolution with a so-called linear shift-invariant graph filter have been recently proposed. This paper goes one step further and, under the framework of multiple-input multiple-output (MIMO) graph filters, imposes additional structure on the adopted graph filters, to obtain three new (more parsimonious) architectures. The proposed architectures result in a lower number of model parameters, reducing the computational complexity, facilitating the training, and mitigating the risk of overfitting. Simulations show that the proposed simpler architectures achieve performance similar to that of more complex models. ...

Convolutional Neural Networks via Node-Varying Graph Filters

Conference paper (2018) - Fernando Gama, Geert Leus, Antonio G. Marques, Alejandro Ribeiro

Convolutional neural networks (CNNs) are being applied to an increasing number of problems and fields due to their superior performance in classification and regression tasks. Since two of the key operations that CNNs implement are convolution and pooling, this type of network is implicitly designed to act on data described by regular structures such as images. Motivated by the recent interest in processing signals defined in irregular domains, we advocate a CNN architecture that operates on signals supported on graphs. The proposed design replaces the classical convolution not with a node-invariant graph filter (GF), which is the natural generalization of convolution to graph domains, but with a node-varying GF. This filter extracts different local features without increasing the output dimension of each layer and, as a result, bypasses the need for a pooling stage while involving only local operations. A second contribution is to replace the node-varying GF with a hybrid node-varying GF, which is a new type of GF introduced in this paper. While the alternative architecture can still be run locally without requiring a pooling stage, the number of trainable parameters is smaller and can be rendered independent of the data dimension. Tests are run on a synthetic source localization problem and on the 20NEWS dataset. ...

Control of graph signals over random time-varying graphs

Conference paper (2018) - Fernando Gama, Elvin Isufi, Geert Leus, Alejandro Ribeiro

In this work, we jointly exploit tools from graph signal processing and control theory to drive a bandlimited graph signal that is being diffused on a random time-varying graph from a subset of nodes. As our main contribution, we rely only on the statistics of the graph to introduce the concept of controllability in the mean, and therefore drive the signal on the expected graph to a desired bandlimited state. A mean-square error (MSE) analysis is performed for two main tasks: i) to highlight the role played by the signal bandwidth and the control nodes to the deviation from the mean signal of a particular realization; and ii) to select the control nodes and design the control signal that minimize this MSE. Numerical results validate the introduced controllability in the mean framework and show its ability to cope with time-varying topologies. ...

Stationary Graph Processes and Spectral Estimation

Journal article (2017) - Antonio G. Marques, Santiago Segarra, Geert Leus, Alejandro Ribeiro

Stationarity is a cornerstone property that facilitates the analysis and processing of random signals in the time domain. Although time-varying signals are abundant in nature, in many practical scenarios the information of interest resides in more irregular graph domains. This lack of regularity hampers the generalization of the classical notion of stationarity to graph signals. The contribution in this paper is twofold. Firstly, we propose a definition of weak stationarity for random graph signals that takes into account the structure of the graph where the random process takes place, while inheriting many of the meaningful properties of the classical definition in the time domain. Provided that the topology of the graph can be described by a normal matrix, our definition models stationary graph processes as the output of a linear graph filter applied to a white input. We will show that this is equivalent to requiring the correlation matrix to be diagonalized by the graph Fourier transform. Secondly, we analyze the properties of the power spectral density and propose a number of methods to estimate it. We start with nonparametric approaches, including periodograms, window-based average periodograms, and filter banks. We then shift the focus to parametric approaches, discussing the estimation of moving-average (MA), autoregressive (AR) and ARMA processes. Finally, we illustrate the power spectral density estimation in synthetic and real-world graph signals. ...
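The equivalence stated in this abstract, between filtering a white input and having a correlation matrix diagonalized by the graph Fourier transform, can be checked on a small example. The symmetric path graph, the filter taps, and the unit-variance white input are assumptions; the covariance is computed in closed form rather than estimated from samples.

```python
import numpy as np

# Hypothetical 4-node path graph; symmetric, hence a normal shift operator.
S = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
lam, V = np.linalg.eigh(S)            # columns of V form the GFT basis

h = [1.0, 0.5, 0.25]                  # assumed filter taps
H = sum(hk * np.linalg.matrix_power(S, k) for k, hk in enumerate(h))

# Covariance of x = H w when w is white with identity covariance.
C = H @ H.T

C_freq = V.T @ C @ V                  # covariance expressed in the frequency domain
psd = np.diag(C_freq)                 # power spectral density
h_freq = sum(hk * lam**k for k, hk in enumerate(h))   # filter frequency response
```

In this sketch `C_freq` is diagonal and its diagonal equals the squared frequency response of the filter, which is the quantity the nonparametric and parametric estimators in the paper aim to recover from data.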

Stationary Graph Processes

Parametric Power Spectral Estimation

Conference paper (2017) - Santiago Segarra, Antonio G. Marques, Geert Leus, Alejandro Ribeiro

Advancing a holistic theory of networks and network processes requires the extension of existing results in the processing of time-varying signals to signals supported on graphs. This paper focuses on the definition of stationarity and power spectral density for random graph signals, generalizes the concepts of autoregressive and moving average random processes to the graph domain, and investigates their parametric spectral estimation. Theoretical and algorithmic results are complemented with numerical tests on synthetic and real-world graphs. ...

Decentralized Prediction-Correction Methods for Networked Time-Varying Convex Optimization

Journal article (2017) - Andrea Simonetto, Alec Koppel, Aryan Mokhtari, Geert Leus, Alejandro Ribeiro

We develop algorithms that find and track the optimal solution trajectory of time-varying convex optimization problems that consist of local and network-related objectives. The algorithms are derived from the prediction-correction methodology, which corresponds to a strategy where the time-varying problem is sampled at discrete time instances, and then, a sequence is generated via alternatively executing predictions on how the optimizers at the next time sample are changing and corrections on how they actually have changed. Prediction is based on how the optimality conditions evolve in time, while correction is based on a gradient or Newton method, leading to decentralized prediction-correction gradient and decentralized prediction-correction Newton. We extend these methods to cases where the knowledge on how the optimization programs are changing in time is only approximate and propose decentralized approximate prediction-correction gradient and decentralized approximate prediction-correction Newton. Convergence properties of all the proposed methods are studied and empirical performance is shown on an application of a resource allocation problem in a wireless network. We observe that the proposed methods outperform existing running algorithms by orders of magnitude. The numerical results showcase a tradeoff between convergence accuracy, sampling period, and network communications. ...
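The prediction-correction idea can be illustrated on a scalar, centralized toy problem. This is a sketch under strong assumptions, not the decentralized algorithms of the paper: the objective `0.5 * (x - cos t)^2`, the sampling period, the step size, and the function name `track` are all chosen for illustration. The prediction step uses how the optimality condition changes in time, and the correction step is a single gradient step on the newly sampled problem.

```python
import math


def track(predict, h=0.1, gamma=0.5, steps=200):
    """Track argmin_x 0.5*(x - cos t)^2 and return the mean tracking error."""
    x, total = 1.0, 0.0            # start at the optimizer for t = 0
    for k in range(steps):
        t, t_next = k * h, (k + 1) * h
        if predict:
            # Prediction: x <- x - h * (d2f/dx2)^-1 * d2f/dtdx.
            # For this objective d2f/dx2 = 1 and d2f/dtdx = sin(t).
            x -= h * math.sin(t)
        # Correction: one gradient step on the problem sampled at t_next.
        x -= gamma * (x - math.cos(t_next))
        total += abs(x - math.cos(t_next))
    return total / steps


err_running = track(predict=False)   # correction only (a "running" method)
err_pc = track(predict=True)         # prediction followed by correction
```

With the same number of gradient steps per sample, the variant with prediction tracks the moving optimizer more closely in this toy setting, in line with the comparison against running algorithms reported in the abstract.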