Deep Minds and Shallow Probes

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Neural representations are not unique: two models that realize the same computation can differ by a reparameterization of their hidden coordinates, so probing results tied to a particular basis can be unstable and hard to transfer across models. This work studies the affine symmetry that arises at the final readout layer and uses it to derive a hierarchy of shallow polynomial probes that is stable under affine changes of coordinates, with linear probes as its degree-1 member. It further proposes the probe-visible quotient, the representation modulo directions invisible to the probe family, as the natural object for cross-model probe transfer. Experiments on synthetic and real-world tasks show where degree-2 (quadratic) probes help beyond linear ones and how quotient-based transfer supports coverage-aware monitor portability across model families.
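To make the coordinate-stability idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a least-squares polynomial probe and synthetic Gaussian hidden states; the helper names `poly_features` and `fit_probe_predictions`, the target `y`, and the transform `(A, b)` are all hypothetical. Because polynomials of degree at most d stay polynomials of degree at most d under an invertible affine change of coordinates, the fitted probe predictions should coincide in both coordinate systems.

```python
import numpy as np

def poly_features(H, degree):
    """Constant, linear, and (for degree 2) pairwise-product features of H."""
    n, d = H.shape
    cols = [np.ones((n, 1)), H]
    if degree >= 2:
        iu, ju = np.triu_indices(d)          # all index pairs i <= j
        cols.append(H[:, iu] * H[:, ju])     # monomials h_i * h_j
    return np.hstack(cols)

def fit_probe_predictions(H, y, degree):
    """Least-squares polynomial probe; returns its in-sample predictions."""
    Phi = poly_features(H, degree)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return Phi @ coef

rng = np.random.default_rng(0)
n, d = 500, 4
H = rng.normal(size=(n, d))                  # hypothetical hidden states
y = H[:, 0] * H[:, 1] + 0.5 * H[:, 2]        # target with a quadratic component

# Hypothetical invertible affine reparameterization h' = A h + b.
A = rng.normal(size=(d, d)) + d * np.eye(d)
b = rng.normal(size=d)
H_new = H @ A.T + b

pred_lin = fit_probe_predictions(H, y, degree=1)
pred_lin_new = fit_probe_predictions(H_new, y, degree=1)
pred_quad = fit_probe_predictions(H, y, degree=2)
pred_quad_new = fit_probe_predictions(H_new, y, degree=2)

mse_lin = float(np.mean((pred_lin - y) ** 2))
mse_quad = float(np.mean((pred_quad - y) ** 2))

# Each probe degree gives the same predictions in both coordinate systems.
print(np.allclose(pred_lin, pred_lin_new, atol=1e-6))
print(np.allclose(pred_quad, pred_quad_new, atol=1e-6))
print(mse_lin, mse_quad)
```

In this toy setup the degree-2 probe recovers the quadratic target while the degree-1 probe cannot, which mirrors the kind of case where a degree-2 probe helps beyond a linear one. The example says nothing about how large that gap is on real tasks.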

📝 Abstract

Neural representations are not unique objects. Even when two systems realize the same downstream computation, their hidden coordinates may differ by reparameterization. A probe family intended to reveal structure already present in a representation should therefore be stable under the relevant representation symmetries rather than be tied to a particular basis. We study this group action in the tractable exact setting of the final readout layer, where equivalent realizations induce affine changes of hidden coordinates. The resulting symmetry principle singles out a unique hierarchy of shallow coordinate-stable probes, with linear probes as its degree-1 member. We also show that a natural object for cross-model probe transfer is a shared probe-visible quotient (the representation modulo directions invisible to the probe family) rather than the full hidden state. Experiments on synthetic and real-world tasks support both predictions, showing where degree-2 probes help beyond linear ones and how quotient-based transfer enables coverage-aware monitor portability across model families. These results point toward a broader geometric representation theory of neural probing, with coverage-aware monitor transfer as a concrete operational consequence.
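The final-readout setting mentioned in the abstract can be illustrated with a small numerical check. This is a sketch under stated assumptions rather than the authors' construction: the readout is taken to be an affine map `W h + c`, and the names `W`, `c`, `A`, `b` and all dimensions are hypothetical. If the hidden coordinates are changed by an invertible affine map h' = A h + b and the readout is adjusted to W' = W A^{-1}, c' = c - W' b, the outputs are unchanged, so the two realizations are equivalent downstream even though their hidden states differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 8, 5, 3                        # samples, hidden width, output size

H = rng.normal(size=(n, d))              # hypothetical hidden states
W = rng.normal(size=(k, d))              # hypothetical readout weights
c = rng.normal(size=k)                   # hypothetical readout bias

# Hypothetical invertible affine change of hidden coordinates: h' = A h + b.
A = rng.normal(size=(d, d)) + d * np.eye(d)
b = rng.normal(size=d)
H_new = H @ A.T + b

# Compensating readout: W' = W A^{-1}, c' = c - W' b.
W_new = W @ np.linalg.inv(A)
c_new = c - W_new @ b

logits = H @ W.T + c
logits_new = H_new @ W_new.T + c_new

# Same outputs, different hidden coordinates.
print(np.allclose(logits, logits_new))
print(np.allclose(H, H_new))
```

A probe tied to the coordinates of `H` has no reason to give the same answer on `H_new`, which is why the abstract asks for probe families that are stable under this affine action.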

Problem

Research questions and friction points this paper is trying to address.

neural representations

representation symmetries

probing

coordinate stability

probe transfer

Innovation

Methods, ideas, or system contributions that make the work stand out.

representation symmetry

coordinate-stable probing

probe quotient