🤖 AI Summary
This work addresses the loss of information capacity in multi-granularity embeddings caused by dimensional redundancy and spectral collapse within nested subspaces. The authors propose MIC, a self-distillation framework that optimizes embedding geometry through isotropic subspace alignment. It combines two regularizers: Soft Collapse Regularization (SCR), which penalizes cross-correlation between prefix and residual subspaces to suppress redundancy, and Spectral Isotropy Regularization (SIR), which encourages low-dimensional prefixes to be uniformly distributed on the hypersphere. The resulting representations are discriminative, semantically dense, and dimensionally flexible, and the authors report that MIC outperforms existing baselines under high-compression settings while preserving information capacity.
📝 Abstract
Although multi-scale representation learning enables elastic-dimension embeddings, nested subspaces often suffer from dimensional redundancy and spectral collapse. To address this, we introduce MIC, a framework that optimizes the geometric landscape of multi-granular embeddings through isotropic subspace alignment. MIC employs Soft Collapse Regularization (SCR) to mitigate redundancy between prefix and residual subspaces via cross-correlation penalties, alongside Spectral Isotropy Regularization (SIR) to ensure hyperspherical uniformity in low-dimensional prefixes. By unifying these strategies through a self-distillation objective, MIC generates semantically dense representations that maintain high discriminative power. Our experiments demonstrate that MIC significantly outperforms standard baselines, particularly in high-compression scenarios where maintaining information capacity is most critical.
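To make the two regularizers concrete, the following is a minimal NumPy sketch of what a prefix/residual cross-correlation penalty and a spectral isotropy penalty might look like. The abstract does not give the exact loss definitions, so both functions are assumptions for illustration only: the function names, the standardization step, and the choice of eigenvalue entropy as an isotropy measure are hypothetical stand-ins, not the authors' formulation of SCR or SIR.

```python
import numpy as np


def cross_correlation_penalty(z, k, eps=1e-8):
    """Hypothetical SCR-style term (an assumption, not the paper's loss).

    Mean squared cross-correlation between the first k dimensions
    (the prefix) and the remaining dimensions (the residual).
    """
    # Standardize every dimension over the batch.
    zc = (z - z.mean(axis=0)) / (z.std(axis=0) + eps)
    prefix, residual = zc[:, :k], zc[:, k:]
    # (k, d - k) matrix of prefix/residual correlations.
    corr = prefix.T @ residual / z.shape[0]
    return float(np.mean(corr ** 2))


def spectral_isotropy_penalty(z, k, eps=1e-12):
    """Hypothetical SIR-style term (an assumption, not the paper's loss).

    Gap between the maximum possible entropy, log(k), and the entropy of
    the normalized eigenvalue spectrum of the prefix covariance. The gap
    is near zero for an isotropic prefix and near log(k) when the prefix
    collapses onto a single direction.
    """
    prefix = z[:, :k] - z[:, :k].mean(axis=0)
    cov = prefix.T @ prefix / prefix.shape[0]
    # eigvalsh handles symmetric matrices; clip tiny negative round-off.
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    p = eig / (eig.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.log(k) - entropy)
```

Under these assumptions, an embedding whose residual dimensions duplicate its prefix scores higher on the first penalty than one with independent dimensions, and a prefix that has collapsed onto one direction scores higher on the second than an isotropic one. In a training setup, terms of this kind would be weighted and added to the self-distillation objective; the weighting is not specified in the abstract.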