Protein Circuit Tracing via Cross-layer Transcoders

📅 2026-02-12
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limited interpretability of current protein language models, which struggle to reveal cross-layer computational mechanisms and reconstruct internal reasoning pathways. To overcome this, the authors propose ProtoMech, a novel framework that jointly learns sparse latent representations across all layers via a cross-layer transcoder, enabling end-to-end tracing of the model's full computational circuitry. Evaluated on ESM2, ProtoMech successfully identifies compressed circuits highly aligned with protein structure and function, facilitating efficient protein design. Experiments show that ProtoMech recovers 82–89% of the original model's performance on protein family classification and function prediction tasks, retains up to 79% accuracy using less than 1% of the latent space, and significantly outperforms existing protein design baselines in over 70% of cases.

📝 Abstract
Protein language models (pLMs) have emerged as powerful predictors of protein structure and function. However, the computational circuits underlying their predictions remain poorly understood. Recent mechanistic interpretability methods decompose pLM representations into interpretable features, but they treat each layer independently and thus fail to capture cross-layer computation, limiting their ability to approximate the full model. We introduce ProtoMech, a framework for discovering computational circuits in pLMs using cross-layer transcoders that learn sparse latent representations jointly across layers to capture the model's full computational circuitry. Applied to the pLM ESM2, ProtoMech recovers 82-89% of the original performance on protein family classification and function prediction tasks. ProtoMech then identifies compressed circuits that use <1% of the latent space while retaining up to 79% of model accuracy, revealing correspondence with structural and functional motifs, including binding, signaling, and stability. Steering along these circuits enables high-fitness protein design, surpassing baseline methods in more than 70% of cases. These results establish ProtoMech as a principled framework for protein circuit tracing.
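
To make the cross-layer idea concrete, the following is a minimal sketch of a cross-layer transcoder forward pass on random toy data. It is not the ProtoMech implementation: the function names (`topk_sparsify`, `clt_forward`), the top-k sparsity rule, the tensor shapes, and the random weights are all assumptions chosen for illustration. The one property it is meant to show is that the sparse latents read at a source layer can write to the reconstructed output of that layer and of every later layer, which a per-layer decomposition cannot do.

```python
import numpy as np

def topk_sparsify(pre_acts, k):
    """Apply ReLU, then keep only the k largest activations in each row.

    Top-k is an assumed sparsity rule for this sketch; the abstract does not
    say which sparsity mechanism ProtoMech uses.
    """
    acts = np.maximum(pre_acts, 0.0)
    if k >= acts.shape[-1]:
        return acts
    idx = np.argpartition(acts, -k, axis=-1)[..., -k:]
    mask = np.zeros_like(acts, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, acts, 0.0)

def clt_forward(resid, enc, dec, k):
    """Toy cross-layer transcoder forward pass (hypothetical interface).

    resid: list of per-layer inputs, each of shape (n_tokens, d_model)
    enc:   list of per-layer encoders, each of shape (d_model, n_latents)
    dec:   dict mapping (src, tgt) with src <= tgt to a decoder of shape
           (n_latents, d_model)
    Returns the sparse latents per layer and the reconstructed output per
    layer, where recon[tgt] sums contributions from all layers src <= tgt.
    """
    n_layers = len(resid)
    latents = [topk_sparsify(resid[l] @ enc[l], k) for l in range(n_layers)]
    recon = []
    for tgt in range(n_layers):
        out = sum(latents[src] @ dec[(src, tgt)] for src in range(tgt + 1))
        recon.append(out)
    return latents, recon

# Random stand-ins for model activations and learned weights.
rng = np.random.default_rng(0)
n_layers, n_tokens, d_model, n_latents, k = 3, 5, 8, 32, 4
resid = [rng.normal(size=(n_tokens, d_model)) for _ in range(n_layers)]
enc = [rng.normal(size=(d_model, n_latents)) for _ in range(n_layers)]
dec = {(s, t): rng.normal(size=(n_latents, d_model))
       for t in range(n_layers) for s in range(t + 1)}

latents, recon = clt_forward(resid, enc, dec, k)
```

In a trained transcoder the encoders and decoders would be fit so that each `recon[tgt]` approximates the corresponding layer output of the pLM. A compressed circuit in the sense of the abstract would then correspond to restricting the sum to a small subset of latents.
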
Problem

Research questions and friction points this paper is trying to address.

protein language models
computational circuits
mechanistic interpretability
cross-layer computation
model interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-layer transcoders
protein language models
computational circuits
mechanistic interpretability
protein design
Darin Tsui
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
Kunal Talreja
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
Daniel Saeedi
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
Amirali Aghazadeh
ECE, Georgia Tech
AI, Machine Learning, Signal Processing, Computational Biology, Molecular Design