🤖 AI Summary
Problem: Conventional coVariance Neural Networks (VNNs) build their graphs from precomputed dense covariance matrices, which fail to capture conditional independence among variables and decouple graph structure learning from the downstream task. Method: We propose Precision Neural Networks (PNNs), which perform graph convolution directly on the inverse covariance (precision) matrix. PNNs jointly optimize the network parameters and a sparse precision matrix in an end-to-end manner, enabling task-aware graph structure learning. A spectral constraint ensures consistency with the covariance, and a theoretical analysis bounds the precision estimation error. The sparsity of the learned precision matrix explicitly encodes conditional independence, improving interpretability and generalization. Results: Experiments on synthetic and real-world datasets show that PNNs significantly outperform two-stage approaches (graph learning followed by separate model training) in both graph structure accuracy and downstream task performance.
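The property the summary leans on, that zeros in the precision matrix encode conditional independence while the covariance stays dense, can be checked numerically. A minimal sketch (not from the paper) using a three-variable Gaussian chain, where the two end variables are marginally correlated yet conditionally independent given the middle one:

```python
import numpy as np

# Hypothetical chain x1 -> x2 -> x3: x1 and x3 are marginally dependent,
# but conditionally independent given x2, so the (1,3) precision entry is ~0.
rng = np.random.default_rng(0)
n = 200_000
x1 = rng.standard_normal(n)
x2 = 0.8 * x1 + rng.standard_normal(n)
x3 = 0.8 * x2 + rng.standard_normal(n)
X = np.column_stack([x1, x2, x3])

cov = np.cov(X, rowvar=False)   # dense: the (1,3) covariance entry is clearly nonzero
prec = np.linalg.inv(cov)       # sparse pattern: the (1,3) precision entry vanishes

print("cov[0,2] :", cov[0, 2])   # marginal dependence (nonzero)
print("prec[0,2]:", prec[0, 2])  # conditional independence (near zero)
```

The covariance mixes direct and indirect dependence, while the precision matrix keeps only direct interactions, which is exactly the structure PNNs exploit for graph convolution.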
📝 Abstract
CoVariance Neural Networks (VNNs) perform convolutions on the graph determined by the covariance matrix of the data, which enables expressive and stable covariance-based learning. However, covariance matrices are typically dense, fail to encode conditional independence, and are often precomputed in a task-agnostic way, which may hinder performance. To overcome these limitations, we study Precision Neural Networks (PNNs), i.e., VNNs on the precision matrix -- the inverse covariance. The precision matrix naturally encodes conditional independence, is often sparse, and preserves the spectral structure of the covariance. To make precision estimation task-aware, we formulate an optimization problem that jointly learns the network parameters and the precision matrix, and solve it via alternating optimization, sequentially updating the network weights and the precision estimate. We theoretically bound the distance between the estimated and the true precision matrix at each iteration, and we demonstrate the effectiveness of joint estimation over two-step approaches on synthetic and real-world data.
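The alternating scheme described in the abstract can be sketched as a toy loop. Everything below is an illustrative assumption, not the paper's implementation: a polynomial "precision filter" stands in for the PNN layer, gradients are taken numerically, and sparsity is imposed on the precision estimate by soft-thresholding its off-diagonal entries after each gradient step.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, K = 5, 200, 3

# Toy regression data: targets linear in the observed variables.
X = rng.standard_normal((n, d))
y = X @ np.array([1.0, -0.5, 0.0, 0.3, 0.0])

# Warm-start the precision estimate from the (regularized) inverse sample covariance.
S = np.cov(X, rowvar=False) + 0.1 * np.eye(d)
Theta = np.linalg.inv(S)
h = np.array([1.0, 0.1, 0.01])   # polynomial filter taps (hypothetical layer)
w = np.zeros(d)                  # linear readout

def predict(Theta, h, w):
    """Polynomial 'precision filter' sum_k h_k X Theta^k, then a readout."""
    out, P = np.zeros_like(X), np.eye(d)
    for k in range(K):
        out += h[k] * (X @ P)
        P = P @ Theta
    return out @ w

def mse(Theta, h, w):
    r = predict(Theta, h, w) - y
    return float(np.mean(r * r))

def num_grad(f, v, eps=1e-5):
    """Central-difference gradient of f() w.r.t. the array v (mutated in place)."""
    g = np.zeros_like(v)
    it = np.nditer(v, flags=["multi_index"])
    for _ in it:
        i = it.multi_index
        old = v[i]
        v[i] = old + eps; fp = f()
        v[i] = old - eps; fm = f()
        v[i] = old
        g[i] = (fp - fm) / (2 * eps)
    return g

lam, lr = 1e-3, 5e-2
loss0 = mse(Theta, h, w)
off = ~np.eye(d, dtype=bool)
for _ in range(150):
    # (1) Network step: descend on the filter taps and readout, Theta fixed.
    w -= lr * num_grad(lambda: mse(Theta, h, w), w)
    h -= lr * num_grad(lambda: mse(Theta, h, w), h)
    # (2) Precision step: gradient step, then l1 proximal shrinkage off-diagonal,
    #     and re-symmetrization to keep Theta a valid precision candidate.
    Theta -= lr * num_grad(lambda: mse(Theta, h, w), Theta)
    Theta[off] = np.sign(Theta[off]) * np.maximum(np.abs(Theta[off]) - lr * lam, 0.0)
    Theta = 0.5 * (Theta + Theta.T)

print("initial loss:", loss0, "-> final loss:", mse(Theta, h, w))
```

The point of the sketch is the control flow, sequential updates of the network weights and the precision estimate against one task loss, rather than the specific filter, sparsity penalty, or the paper's spectral constraint, which are not reproduced here.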