Graph-Adaptive Activation Functions for Graph Neural Networks


Abstract

Network data are essential in applications such as recommender systems, social networks, and sensor networks. A defining characteristic of these data is the coupling between the data values and the underlying network structure on which they are defined. Graph Neural Networks (GNNs) were designed to extend the benefits of deep learning to network data. One crucial component of a GNN is its nonlinear component, also known as the activation function, which allows the network to capture nonlinear relationships in the input data. In the current literature, however, this essential data-network topology coupling is ignored in the nonlinear component of the GNN. To address this limitation, this thesis proposes a new family of activation functions for GNNs that account for the graph structure and capture the data-network topology coupling, while also allowing for a distributed implementation. Specifically, we propose an initial diffusion of the data over the graph prior to the local nonlinearization, where the nonlinearization takes a form akin to a graph convolution. This yields a graph-adaptive, trainable nonlinear component of the GNN that can be implemented directly or via kernel transformations, thereby enriching the class of functions available to represent the network data. In both the direct and kernel forms, we show that permutation equivariance is preserved. This property ensures the output of the GNN is independent of node labeling and that the GNN exploits graph symmetries to generalize to different graphs sharing similar symmetries. Numerical experiments with distributed source localization, finite-time consensus, and distributed regression demonstrate the applicability of the proposed graph-adaptive activation functions in distributed scenarios and show improved or comparable performance relative to pointwise and state-of-the-art localized nonlinearities. Our findings also suggest that the proposed activation functions are beneficial when communication resources in the network are limited.
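To make the construction concrete, the sketch below gives one plausible reading of the proposed activation, under the assumption that it takes the graph-convolution-like form σ(x) = θ₀ f(x) + θ₁ f(Sx) + … + θ_{K−1} f(S^{K−1}x), where S is a graph shift operator (e.g., the adjacency matrix), f is a pointwise nonlinearity, and the θ_k are trainable coefficients. The function name graph_adaptive_activation, the choice of f, and the exact combination rule are illustrative assumptions based on the abstract, not the thesis' precise formulation.

```python
import numpy as np

def graph_adaptive_activation(S, x, theta, f=np.tanh):
    """Hypothetical sketch of a graph-adaptive activation.

    Diffuses the signal x over the graph via powers of the shift
    operator S, applies the pointwise nonlinearity f to each diffused
    copy, and combines the results with trainable weights theta,
    a form akin to a graph convolution.
    """
    out = np.zeros_like(x)
    x_shifted = x.copy()
    for theta_k in theta:
        out += theta_k * f(x_shifted)  # local nonlinearization of the k-hop diffusion
        x_shifted = S @ x_shifted      # one more hop: a single exchange with neighbors
    return out

# Example: a 4-node cycle graph with the adjacency matrix as shift operator.
S = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x = np.array([1.0, -0.5, 0.3, 2.0])
theta = np.array([0.7, 0.2, 0.1])      # trainable coefficients, one per hop
y = graph_adaptive_activation(S, x, theta)
```

Two properties of this form line up with the abstract: each multiplication by S requires only one exchange of values between neighboring nodes, so the activation admits a distributed implementation; and relabeling the nodes permutes S and x consistently, so the output permutes the same way, which is the permutation equivariance property described above.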