PEIRA: Learning Predictive Encoders through Inter-View Regressor Alignment

📅 2026-05-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Non-contrastive self-supervised learning methods that rely on self-distillation often lack a well-defined optimization objective, which makes their training dynamics and stability difficult to analyze. This work proposes PEIRA, a method built on the Joint Embedding Predictive Architecture (JEPA) with an explicit objective: the loss is defined via the trace of the optimal linear regressor between the embeddings of two views, augmented with trace regularization to control the effective dimensionality of the learned representations. Theoretical analysis shows that the only stable equilibria of this objective are nontrivial global minimizers, and that they recover the leading nonlinear canonical correlation subspaces. Experiments on ImageNet-1K and CIFAR-10 show that PEIRA is competitive with VICReg and LeJEPA baselines, and qualitative results support the theoretical predictions.

📝 Abstract

Non-contrastive self-supervised learning (SSL) is an effective framework for predictive representation learning, but popular (and in practice effective) methods such as SimSiam, BYOL, I-JEPA or DINO, which rely on a form of self-distillation to train a teacher-student network, remain poorly understood as they typically do not minimize a well-defined objective. We analyze the dynamics of a variant of the Joint Embedding Predictive Architecture (JEPA) using a regularized linear regressor to predict the learned representations of two views of the data from one another, and fully characterize its stability: non-collapsed stable equilibria align with leading nonlinear canonical correlation subspaces, while collapsed equilibria may also be stable attractors. Motivated by this result, we introduce PEIRA, a non-contrastive SSL method with an explicit objective defined through the trace of the optimal linear regressor. We show that its only stable equilibria are nontrivial global minimizers and recover the same canonical correlation subspaces, with regularization selecting the effective dimension. Experiments on ImageNet-1K and CIFAR-10 show PEIRA is competitive with VICReg and LeJEPA baselines, and qualitative empirical results support the theory.
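To make the phrase "trace of the optimal linear regressor" more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's objective: the abstract does not give the exact loss, so the ridge form of the regressor, the symmetrization over the two views, and the names `ridge_regressor`, `trace_alignment`, and `lam` are all hypothetical choices made for this example.

```python
import numpy as np


def ridge_regressor(z_a, z_b, lam=1e-3):
    """Closed-form ridge regressor W minimizing
    mean ||z_a @ W - z_b||^2 + lam * ||W||_F^2 (an assumed form)."""
    n, d = z_a.shape
    cov_aa = z_a.T @ z_a / n  # (d, d) covariance of view A
    cov_ab = z_a.T @ z_b / n  # (d, d) cross-covariance of A and B
    return np.linalg.solve(cov_aa + lam * np.eye(d), cov_ab)


def trace_alignment(z_a, z_b, lam=1e-3):
    """Average trace of the two inter-view ridge regressors.

    Assumption: this symmetric trace is only a stand-in for the
    quantity the abstract describes; sign, scaling, and the extra
    regularizer used by PEIRA are not specified in this page.
    """
    z_a = z_a - z_a.mean(axis=0)  # center each view
    z_b = z_b - z_b.mean(axis=0)
    w_ab = ridge_regressor(z_a, z_b, lam)  # predicts view B from view A
    w_ba = ridge_regressor(z_b, z_a, lam)  # predicts view A from view B
    return 0.5 * (np.trace(w_ab) + np.trace(w_ba))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(2000, 4))
    noise = rng.normal(size=(2000, 4))
    print("identical views:  ", trace_alignment(z, z))
    print("independent views:", trace_alignment(z, noise))
    print("collapsed views:  ", trace_alignment(np.zeros((2000, 4)), np.zeros((2000, 4))))
```

In this toy setting the trace approaches the embedding dimension when the two views predict each other perfectly, stays near zero when they are independent, and is exactly zero for collapsed (constant) embeddings, which gives some intuition for why a trace-based objective can distinguish collapsed from non-collapsed solutions.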

Problem

Research questions and friction points this paper is trying to address.

self-supervised learning

non-contrastive learning

representation collapse

predictive representation

stability analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

non-contrastive self-supervised learning

predictive representation learning

canonical correlation analysis