🤖 AI Summary
This study addresses node regression in a doubly noisy setting where both the node covariates and the graph structure are corrupted. Focusing on random geometric graphs contaminated by Erdős–Rényi noise edges, with only noise-perturbed versions of the latent node covariates observed, the work proposes a task-specific graph attention network that constructs denoised proxy features to enhance regression performance. It provides rigorous statistical guarantees, which remain scarce for graph attention mechanisms, showing asymptotically lower error than ordinary least squares (OLS) on the noisy covariates for estimating the regression coefficient, and than a vanilla graph convolutional network (GCN) for predicting the response at unlabeled nodes. The analysis leverages high-dimensional geometric tail bounds, neighborhood-counting arguments, and covariance concentration. Empirical validation on synthetic and real-world graph data confirms the method's effectiveness.
📝 Abstract
Graph attention networks (GATs) are widely used and often appear robust to noise in node covariates and edges, yet rigorous statistical guarantees demonstrating a provable advantage of GATs over non-attention graph neural networks (GNNs) are scarce. We partially address this gap for node regression with graph-based errors-in-variables models under simultaneous covariate and edge corruption: responses are generated from latent node-level covariates, but only noise-perturbed versions of the latent covariates are observed; and the sample graph is a random geometric graph created from the node covariates but contaminated by independent Erdős–Rényi edges. We propose and analyze a carefully designed, task-specific GAT that constructs denoised proxy features for regression. We prove that, under mild growth conditions, regressing the response variables on the proxies achieves asymptotically lower error in (a) estimating the regression coefficient, compared to the ordinary least squares (OLS) estimator on the noisy node covariates, and (b) predicting the response for an unlabelled node, compared to a vanilla graph convolutional network (GCN). Our analysis leverages high-dimensional geometric tail bounds and concentration for neighbourhood counts and sample covariances. We verify our theoretical findings through experiments on synthetically generated data. We also perform experiments on real-world graphs and demonstrate the effectiveness of the attention mechanism in several node regression tasks.
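The data-generating process described in the abstract (latent covariates, noise-perturbed observations, and a random geometric graph contaminated by Erdős–Rényi edges) can be simulated directly. The sketch below is illustrative only and is not the paper's construction: the similarity-kernel "attention" weights, the uniform-average "GCN" baseline, and all constants (noise level, graph radius, edge probability, kernel bandwidth) are assumptions chosen to make the errors-in-variables attenuation effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1500, 2
beta = np.array([3.0, -2.0])
sigma = 0.3      # covariate measurement-noise level (assumed)

Z = rng.random((n, d))                    # latent node covariates in [0, 1]^2
X = Z + sigma * rng.normal(size=(n, d))   # observed noise-perturbed covariates
y = Z @ beta + 0.1 * rng.normal(size=n)   # responses driven by the *latent* covariates

# Random geometric graph on the latent covariates (radius 0.1, assumed),
# contaminated by independent Erdos-Renyi noise edges (p = 0.01, assumed).
geo = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1) < 0.1 ** 2
er = rng.random((n, n)) < 0.01
A = np.triu(geo | er, 1)
A = (A | A.T).astype(float) + np.eye(n)   # symmetrise, add self-loops

def coef_error(features):
    """OLS with an intercept; distance of the fitted slopes from the true beta."""
    F = np.column_stack([np.ones(n), features])
    slopes = np.linalg.lstsq(F, y, rcond=None)[0][1:]
    return np.linalg.norm(slopes - beta)

# (a) OLS on the noisy covariates: classic errors-in-variables attenuation bias.
err_ols = coef_error(X)

# (b) GCN-style proxy: uniform neighbourhood average; the ER noise edges
#     mix in covariates of unrelated nodes and distort the proxies.
err_gcn = coef_error(A @ X / A.sum(1, keepdims=True))

# (c) Attention-style proxy: neighbours weighted by similarity of the observed
#     features, which down-weights the spurious ER edges (their endpoints are
#     typically far apart in latent space).
W = A * np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / 0.5)
err_att = coef_error(W @ X / W.sum(1, keepdims=True))

print(f"coefficient error, OLS on noisy X : {err_ols:.3f}")
print(f"coefficient error, uniform average: {err_gcn:.3f}")
print(f"coefficient error, attention-style: {err_att:.3f}")
```

Under these assumed constants, the similarity-weighted proxy tends to yield the smallest coefficient-estimation error, the uniform average sits in between, and OLS on the raw noisy covariates does worst, mirroring the qualitative ordering the paper establishes asymptotically.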