🤖 AI Summary
This study investigates how representational structures differ across the modules of neural foundation models (feedforward encoder, recurrent module, and readout module) and how those differences affect alignment with biological visual systems. We propose a physiology-inspired manifold analysis framework that, for the first time in foundation model interpretability, jointly models encoding manifolds (neurons → responses) and decoding manifolds (stimuli → activity), integrating manifold learning, parametric response modeling, and spatiotemporal pattern decomposition. Our results reveal: (1) fundamental geometric heterogeneity among the three modules’ representational manifolds; (2) the recurrent module enhances discriminability by dynamically “repelling” temporal patterns in latent space; and (3) while the readout module achieves high biological fidelity, it does so through dedicated feature maps, a mechanism that is not neurobiologically plausible. This work establishes a novel analytical paradigm for model–brain alignment and provides structured, geometry-aware interpretability grounded in neural computation principles.
📝 Abstract
Foundation models have shown remarkable success in fitting biological visual systems; however, their black-box nature inherently limits their utility for understanding brain function. Here, we peek inside a state-of-the-art foundation model of neural activity (Wang et al., 2025) as a physiologist might, characterizing each 'neuron' based on its temporal response properties to parametric stimuli. We analyze how different stimuli are represented in neural activity space by building decoding manifolds, and we analyze how different neurons are represented in stimulus-response space by building neural encoding manifolds. We find that the different processing stages of the model (i.e., the feedforward encoder, recurrent, and readout modules) each exhibit qualitatively different representational structures in these manifolds. The recurrent module shows a jump in capabilities over the encoder module by 'pushing apart' the representations of different temporal stimulus patterns, while the readout module achieves biological fidelity by using numerous specialized feature maps rather than biologically plausible mechanisms. Overall, we present this work as a study of the inner workings of a prominent neural foundation model, gaining insights into the biological relevance of its internals through the novel analysis of its neurons' joint temporal response patterns.
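The difference between the two manifold views in the abstract comes down to which axis of the response data supplies the points. The sketch below is an illustration only, not the authors' pipeline: it assumes a hypothetical response array indexed as `responses[neuron][stimulus][time]`, fills it with random toy values, and builds the point clouds and pairwise distances that a manifold learning method would take as input. All function names are made up for this example, and time-averaging in the decoding view is a simplification chosen here.

```python
import math
import random


def make_toy_responses(n_neurons=6, n_stimuli=4, n_time=5, seed=0):
    """Hypothetical toy data: responses[n][s][t] is the activity of
    neuron n at time step t while stimulus s is shown."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(n_time)]
             for _ in range(n_stimuli)]
            for _ in range(n_neurons)]


def encoding_points(responses):
    """Encoding view: one point per neuron.
    Coordinates are that neuron's responses flattened over (stimulus, time),
    i.e. the neuron is placed in stimulus-response space."""
    return [[r for stim in neuron for r in stim] for neuron in responses]


def decoding_points(responses):
    """Decoding view: one point per stimulus.
    Coordinates are the time-averaged activity of every neuron,
    i.e. the stimulus is placed in neural activity space.
    (Time-averaging is an assumption made to keep the sketch short.)"""
    n_stimuli = len(responses[0])
    return [[sum(neuron[s]) / len(neuron[s]) for neuron in responses]
            for s in range(n_stimuli)]


def pairwise_distances(points):
    """Euclidean distance matrix between all pairs of points."""
    return [[math.dist(p, q) for q in points] for p in points]


responses = make_toy_responses()
enc = encoding_points(responses)   # 6 neurons, each a 20-dimensional point
dec = decoding_points(responses)   # 4 stimuli, each a 6-dimensional point
enc_dist = pairwise_distances(enc)
dec_dist = pairwise_distances(dec)
```

Either distance matrix could then be handed to a dimensionality reduction or manifold learning method of choice. The same response data yields both views; only the choice of which axis indexes the points changes.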