🤖 AI Summary
This paper addresses the empirical phenomenon of representation alignment—the observation that model representations increasingly converge as scale and performance grow—and proposes a learning-theoretic framework for analyzing it. Methodologically, it unifies notions of alignment across metric, probabilistic, and spectral perspectives; studies representation stitching as a task-driven way to compare representations; and establishes a necessary and sufficient condition for successful stitching in terms of kernel alignment. Theoretical contributions include: (i) an upper bound on the generalization error of stitched representations; (ii) a proof relating kernel alignment to representation transferability; and (iii) a first step toward a falsifiable learning-theoretic account of representation convergence in large models. Throughout, the analysis draws on kernel methods, spectral graph theory, and probabilistic modeling to connect and support these results.
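To make the central quantity concrete, the sketch below computes kernel alignment between two feature matrices using centered linear kernels. This is one standard instantiation (CKA-style); the paper's exact definition and kernel choice may differ.

```python
import numpy as np

def center_gram(K):
    """Center a Gram (kernel) matrix: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_alignment(X, Y):
    """Centered kernel alignment between two representations.

    X, Y: (n_samples, dim) feature matrices from two models evaluated
    on the same inputs. Uses linear kernels K = XX^T and L = YY^T;
    returns a similarity in [0, 1], where 1 means the two kernels are
    perfectly aligned.
    """
    K = center_gram(X @ X.T)
    L = center_gram(Y @ Y.T)
    return np.sum(K * L) / (np.linalg.norm(K) * np.linalg.norm(L))

# Toy usage: Y is an orthogonal rotation of X, so their linear kernels
# coincide and the alignment is (numerically) 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))  # random orthogonal map
print(kernel_alignment(X, X @ Q))  # ~1.0
```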
📝 Abstract
It has recently been argued that AI models' representations become aligned as their scale and performance increase. Empirical analyses have been designed to support this idea and to conjecture that different representations may align toward a shared statistical model of reality. In this paper, we propose a learning-theoretic perspective on representation alignment. First, we review and connect different notions of alignment based on metric, probabilistic, and spectral ideas. Then, we focus on stitching, a particular approach to understanding the interplay between different representations in the context of a task. Our main contribution here is relating properties of stitching to the kernel alignment of the underlying representations. Our results can be seen as a first step toward casting representation alignment as a learning-theoretic problem.
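To illustrate what stitching means operationally, here is a minimal sketch assuming the simplest variant: a linear stitching layer fitted by least squares between two frozen representations of the same inputs. The paper's task-driven formulation may instead train the stitch against a downstream loss, but the residual below captures the same intuition: a small stitching error means one representation can stand in for the other.

```python
import numpy as np

def fit_stitch(Z_a, Z_b):
    """Least-squares linear stitch mapping model A's features to model B's.

    Z_a: (n_samples, d_a) and Z_b: (n_samples, d_b) are representations
    of the same inputs. Returns W minimizing ||Z_a W - Z_b||_F, i.e. the
    layer one would splice between A's encoder and B's head.
    """
    W, *_ = np.linalg.lstsq(Z_a, Z_b, rcond=None)
    return W

def stitching_error(Z_a, Z_b):
    """Relative residual of the stitched representation (0 = perfect stitch)."""
    W = fit_stitch(Z_a, Z_b)
    return np.linalg.norm(Z_a @ W - Z_b) / np.linalg.norm(Z_b)

# Toy usage: B's features are a linear transform of A's, so a linear
# stitch recovers them and the stitching error is near zero.
rng = np.random.default_rng(0)
Z_a = rng.normal(size=(200, 16))
Z_b = Z_a @ rng.normal(size=(16, 8))
print(stitching_error(Z_a, Z_b))  # ~0.0
```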