Rank Is Not Capacity: Spectral Occupancy for Latent Graph Models

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Graph representation learning usually treats the latent dimension as a fixed hyperparameter, but the nominal rank is a poor proxy for a model's actual capacity because latent factors are identifiable only up to rotation and rescaling. This work proposes Spectra, which takes the eigenvalue spectrum of a trace-normalized positive semidefinite kernel as the unit of analysis. Using the Shannon effective rank of that spectrum to both measure and control capacity, Spectra treats capacity as an intrinsic property of the fitted model rather than an external hyperparameter. By combining spectral prefix extraction, trace-normalized kernel matrices, and bisection-based targeting, Spectra traces performance–capacity trade-offs across diverse real-world networks, performs competitively with strong link-prediction baselines, and yields aligned low-dimensional views at varying capacities from a single trained model.
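
For reference, the Shannon effective rank is usually defined as the exponential of the entropy of the normalized spectrum. The block below writes out that standard definition; this page does not reproduce the paper's own formula, so the symbols K (learned kernel), r (rank cap), and p_i (normalized eigenvalues) are notation assumed here, with the convention 0 log 0 = 0.

```latex
\tilde{K} = \frac{K}{\operatorname{tr}(K)}, \qquad
p_i = \lambda_i(\tilde{K}) \ge 0, \qquad \sum_{i=1}^{r} p_i = 1,
\qquad
\operatorname{erank}(K) = \exp\Bigl(-\sum_{i=1}^{r} p_i \log p_i\Bigr),
\qquad 1 \le \operatorname{erank}(K) \le r .
```

The lower bound is attained when a single eigenvalue carries all the trace, and the upper bound when the spectrum is uniform.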

📝 Abstract

Graph representation learning has become a standard approach for analyzing networked data, with latent embeddings widely used for link prediction, community detection, and related tasks. Yet a basic design choice, the latent dimension, is still treated as a brittle hyperparameter, fixed before training and tuned by held-out performance. Learned factors are also identifiable only up to rotation and rescaling, so the nominal rank rarely coincides with the quantity that governs model behavior. We propose Spectral Prefix Extraction and Capacity-Targeted Representation Analysis (Spectra), which replaces rank as the unit of analysis with the spectrum of a learned positive semidefinite kernel, trace-normalized so that spectra are comparable across fits. The normalized eigenvalues form a distribution on the simplex, and their Shannon effective rank acts both as a summary of learned capacity and as a controllable training-time coordinate: a single scalar shapes this realized dimension during training, and bisection targets any desired value within the rank cap. To support this theoretically, we establish local regularity and monotonicity of the realized-dimension profile. Across collaboration, social, biological, and infrastructure networks, Spectra traces performance–capacity frontiers that make the trade-off between predictive accuracy and realized dimension visible. It performs competitively with strong link-prediction baselines, yields aligned lower-capacity views of the same fitted model through spectral prefixes, and provides a principled handle on capacity in the overparameterized regime. Capacity thus becomes a property of the fitted model rather than a training hyperparameter.
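
The quantities named in the abstract can be illustrated with a small standard-library Python sketch. It is not the authors' implementation. It assumes the standard definition of Shannon effective rank (the exponential of the entropy of the trace-normalized eigenvalues), and it reads a "spectral prefix" as simply the k largest eigenvalues. Because the abstract does not specify how the scalar control enters training, the sketch substitutes a hypothetical one-parameter family of spectra, `toy_spectrum`, whose effective rank decreases monotonically in the parameter `t`, and then bisects on `t`. All function and parameter names are illustrative.

```python
import math


def trace_normalize(eigenvalues):
    """Clip negatives to zero and rescale so the spectrum sums to 1."""
    clipped = [max(float(v), 0.0) for v in eigenvalues]
    total = sum(clipped)
    if total <= 0.0:
        raise ValueError("spectrum must have positive trace")
    return [v / total for v in clipped]


def effective_rank(eigenvalues):
    """Shannon effective rank: exp of the entropy of the normalized spectrum."""
    p = trace_normalize(eigenvalues)
    entropy = -sum(q * math.log(q) for q in p if q > 0.0)  # 0 log 0 := 0
    return math.exp(entropy)


def spectral_prefix(eigenvalues, k):
    """Keep the k largest eigenvalues (one illustrative reading of 'prefix')."""
    return sorted(eigenvalues, reverse=True)[:k]


def toy_spectrum(t, rank_cap):
    """Hypothetical stand-in for training: p_i proportional to exp(-t * i)."""
    return trace_normalize([math.exp(-t * i) for i in range(rank_cap)])


def bisect_for_target(target, rank_cap, lo=0.0, hi=50.0, tol=1e-9):
    """Find t whose toy spectrum has effective rank close to `target`.

    Relies on effective rank being decreasing in t on [lo, hi].
    """
    top = effective_rank(toy_spectrum(lo, rank_cap))
    bottom = effective_rank(toy_spectrum(hi, rank_cap))
    if not bottom < target < top:
        raise ValueError("target is outside the reachable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if effective_rank(toy_spectrum(mid, rank_cap)) > target:
            lo = mid  # spectrum still too flat: concentrate it further
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t = bisect_for_target(target=3.0, rank_cap=8)
    spectrum = toy_spectrum(t, rank_cap=8)
    print("control value t:", t)
    print("realized effective rank:", effective_rank(spectrum))
    print("top-3 prefix:", spectral_prefix(spectrum, 3))
```

A uniform spectrum of length r gives an effective rank of r, and a spectrum concentrated on one eigenvalue gives 1, so the target passed to `bisect_for_target` must lie strictly between those extremes. The monotonicity that makes bisection valid holds for this toy family by construction; for the actual model, the abstract attributes the corresponding property to the paper's regularity and monotonicity results.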

Problem

Research questions and friction points this paper is trying to address.

latent graph models

rank

model capacity

spectral occupancy

representation learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

spectral occupancy

effective rank

latent graph models