🤖 AI Summary
Existing representation learning methods, both contrastive and non-contrastive, recover only the linear span of the top eigenfunctions of the context kernel rather than an exact, ordered spectral decomposition; this limitation hinders faithful modeling of feature importance and adaptive dimension selection. To address this, we propose a general framework that extracts identifiable, spectrally ordered eigenfunctions of the context kernel. The framework is built from modular components designed to be compatible with the context kernel and scalable to modern settings, and we show how two paradigms, low-rank approximation and Rayleigh quotient optimization, fit within it. We validate the framework on synthetic kernels and real-world image data: the estimated eigenvalues serve as importance scores for feature selection, enabling principled trade-offs between accuracy and efficiency through adaptive-dimensional representations.
📝 Abstract
Recent advances in representation learning reveal that widely used objectives, such as contrastive and non-contrastive losses, implicitly perform spectral decomposition of a contextual kernel induced by the relationship between inputs and their contexts. Yet these methods recover only the linear span of the top eigenfunctions of the kernel, whereas exact spectral decomposition is essential for understanding feature ordering and importance. In this work, we propose a general framework to extract ordered and identifiable eigenfunctions, based on modular building blocks designed to satisfy key desiderata, including compatibility with the contextual kernel and scalability to modern settings. We then show how two main methodological paradigms, low-rank approximation and Rayleigh quotient optimization, align with this framework for eigenfunction extraction. Finally, we validate our approach on synthetic kernels and demonstrate on real-world image datasets that the recovered eigenvalues act as effective importance scores for feature selection, enabling principled efficiency-accuracy tradeoffs via adaptive-dimensional representations.
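
The gap between recovering the span of the top eigenfunctions and recovering the ordered eigenfunctions themselves can be illustrated on a small synthetic kernel matrix. The sketch below is not the paper's method: it assumes a finite symmetric positive semidefinite matrix `K` stands in for the contextual kernel, and it uses the classical Rayleigh-Ritz procedure as a stand-in for the Rayleigh quotient paradigm. The function name `ordered_eigvecs_from_span` and all constants are hypothetical, chosen only for illustration.

```python
import numpy as np


def ordered_eigvecs_from_span(K, basis):
    """Rayleigh-Ritz step: given any basis of an invariant subspace of the
    symmetric matrix K, return eigenvalues (descending) and the matching
    eigenvectors. Hypothetical helper, not taken from the paper."""
    Q, _ = np.linalg.qr(basis)            # orthonormalize the given basis
    small = Q.T @ K @ Q                   # k x k projection of K onto the span
    vals, vecs = np.linalg.eigh(small)    # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]        # reorder to descending
    return vals[order], Q @ vecs[:, order]


rng = np.random.default_rng(0)
n, k = 50, 3

# Synthetic PSD "kernel" with a known spectrum (assumed values).
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
spectrum = np.concatenate([[5.0, 3.0, 1.0], np.full(n - k, 0.01)])
K = (U * spectrum) @ U.T

# A basis with the correct span but arbitrarily rotated directions,
# mimicking a representation that is identifiable only up to rotation.
R, _ = np.linalg.qr(rng.standard_normal((k, k)))
mixed = U[:, :k] @ R

vals, vecs = ordered_eigvecs_from_span(K, mixed)
print(vals)  # eigenvalues in descending order, usable as importance scores
```

Under these assumptions, the rotated basis `mixed` carries no ordering information on its own, while the recovered `vals` rank the directions, so keeping only the first `m` columns of `vecs` gives a truncated, adaptive-dimensional representation.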