How Does Preconditioning Guide Feature Learning in Deep Neural Networks?

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates how preconditioning influences feature learning and generalization in deep neural networks, focusing on the spectral bias it induces and the alignment of that bias with the teacher model's spectral structure. We propose a preconditioner defined as the $p$-th power of the input covariance matrix and establish a theoretical framework under a single-index teacher model. Our analysis demonstrates that preconditioning selectively amplifies or suppresses input feature components by modulating the spectrum of the Gram matrix. Crucially, when the induced spectral bias aligns with the teacher's spectral structure, the model exhibits substantial improvements in noise robustness, out-of-distribution generalization, and knowledge transfer. We validate this mechanism through theoretical analysis and feature-level evaluations across multiple metrics. The results reveal a controllable, alignment-driven paradigm for feature learning, offering both interpretability and practical utility for designing generalizable deep models.
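
As a concrete reading of that Gram-matrix mechanism, here is a minimal NumPy sketch, not the authors' code: the helper `covariance_power`, the ridge `eps`, and the toy anisotropic data are all our assumptions.

```python
import numpy as np

def covariance_power(X: np.ndarray, p: float, eps: float = 1e-8) -> np.ndarray:
    """Return (C + eps*I)^p for the empirical covariance C of the rows of X."""
    C = np.cov(X, rowvar=False)
    lam, U = np.linalg.eigh(C + eps * np.eye(C.shape[0]))  # symmetric eigendecomposition
    return (U * lam**p) @ U.T                              # U diag(lam^p) U^T

rng = np.random.default_rng(0)
# Illustrative anisotropic inputs with covariance diag(2.0, ..., 0.2)
X = rng.standard_normal((500, 10)) * np.sqrt(np.linspace(2.0, 0.2, 10))
P = covariance_power(X, p=0.5)
G = X @ P @ X.T  # Gram matrix in the preconditioner's metric: G_ij = x_i^T P x_j
# Raising C to the power p rescales each covariance eigenvalue lam_i toward
# lam_i**(1+p) in G, amplifying top directions for p > 0 and flattening the
# spectrum for p < 0.
```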

📝 Abstract
Preconditioning is widely used in machine learning to accelerate convergence on the empirical risk, yet its effect on the expected risk remains underexplored. In this work, we investigate how preconditioning affects feature learning and generalization performance. We first show that the input information available to the model is conveyed solely through the Gram matrix defined by the preconditioner's metric, thereby inducing a controllable spectral bias on feature learning. Concretely, instantiating the preconditioner as the $p$-th power of the input covariance matrix within a single-index teacher model, we prove that the exponent $p$ and the alignment between the teacher and the input spectrum are the crucial factors for generalization. We further investigate how the interplay between these factors influences feature learning from three complementary perspectives: (i) robustness to noise, (ii) out-of-distribution generalization, and (iii) forward knowledge transfer. Our results indicate that the learned feature representations closely mirror the spectral bias introduced by the preconditioner: they favor components that are emphasized and exhibit reduced sensitivity to those that are suppressed. Crucially, we demonstrate that generalization is significantly enhanced when this spectral bias is aligned with that of the teacher.
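
To ground the setup, a hedged sketch of a single-index teacher trained against with preconditioned gradient descent; the linear student, the `tanh` link, and every hyperparameter here are illustrative choices, not the paper's exact protocol.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, p, eta = 10, 2000, 1.0, 0.1
lam = np.linspace(2.0, 0.2, d)                  # assumed input spectrum (descending)
X = rng.standard_normal((n, d)) * np.sqrt(lam)  # inputs with covariance diag(lam)

w_star = np.zeros(d); w_star[0] = 1.0           # teacher on the top eigendirection
y = np.tanh(X @ w_star)                         # single-index teacher, link phi = tanh

P = np.diag(lam ** p)                           # preconditioner: p-th power of the covariance
w = np.zeros(d)
for _ in range(300):
    grad = X.T @ (X @ w - y) / n                # squared-loss gradient of a linear student
    w -= eta * P @ grad                         # preconditioned update w <- w - eta * P * grad
```

Because the teacher sits on the top eigendirection, a positive $p$ raises the effective step size exactly where the signal lives; placing the teacher on the bottom direction (`w_star[-1] = 1.0`) would let the same bias work against it.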
Problem

Research questions and friction points this paper is trying to address.

How does preconditioning affect feature learning and generalization, beyond accelerating optimization?
What spectral bias does the preconditioner's metric induce on the learned features?
When does aligning the preconditioner's spectral bias with the teacher's spectrum improve generalization? (A toy alignment metric is sketched below.)
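
One plausible way to quantify that alignment, as our illustration rather than the paper's formal definition: expand the teacher direction in the covariance eigenbasis and see where its mass falls.

```python
import numpy as np

def spectral_alignment(C: np.ndarray, w_star: np.ndarray) -> np.ndarray:
    """Fraction of the teacher's mass on each covariance eigendirection,
    ordered from largest to smallest eigenvalue."""
    lam, U = np.linalg.eigh(C)            # eigenvalues in ascending order
    coeffs = (U.T @ w_star) ** 2          # squared coefficients in the eigenbasis
    return coeffs[::-1] / coeffs.sum()    # reorder: top eigenvalue first

# Mass concentrated at index 0 means the teacher lives in the directions a
# p > 0 preconditioner amplifies; mass at the tail means it gets suppressed.
```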
Innovation

Methods, ideas, or system contributions that make the work stand out.

The preconditioner induces a controllable spectral bias through the Gram matrix it defines
The exponent $p$ and teacher-spectrum alignment jointly govern generalization
Aligning the spectral bias with the teacher enhances performance (see the toy sweep after this list)
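
A toy sweep, under assumed settings rather than any experiment reported in the paper: compare test error across exponents $p$ for a teacher aligned with the top versus the bottom eigendirection.

```python
import numpy as np

def test_mse(p: float, teacher_idx: int, d: int = 10, n: int = 500,
             steps: int = 300, eta: float = 0.05, seed: int = 2) -> float:
    """Train a linear student with preconditioned GD; return test MSE."""
    rng = np.random.default_rng(seed)
    lam = np.linspace(2.0, 0.2, d)                    # assumed input spectrum
    Xtr = rng.standard_normal((n, d)) * np.sqrt(lam)
    Xte = rng.standard_normal((n, d)) * np.sqrt(lam)
    w_star = np.zeros(d); w_star[teacher_idx] = 1.0   # teacher direction
    ytr, yte = Xtr @ w_star, Xte @ w_star
    P, w = np.diag(lam ** p), np.zeros(d)             # preconditioner lam^p
    for _ in range(steps):
        w -= eta * P @ (Xtr.T @ (Xtr @ w - ytr) / n)  # preconditioned GD step
    return float(np.mean((Xte @ w - yte) ** 2))

for p in (-0.5, 0.0, 0.5, 1.0):
    print(f"p={p:+.1f}  aligned={test_mse(p, 0):.4f}  anti-aligned={test_mse(p, 9):.4f}")
```

The expectation from the claim above is lower error at a fixed step budget in the aligned case as $p$ grows, and the reverse for the anti-aligned teacher; run the sweep to check rather than taking that on faith.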