Protein Circuit Tracing via Cross-layer Transcoders

📅 2026-02-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited interpretability of protein language models: existing interpretability methods analyze each layer independently, so they miss cross-layer computation and cannot approximate the full model. The authors propose ProtoMech, a framework that uses cross-layer transcoders to learn sparse latent representations jointly across layers and thereby trace the model's computational circuitry. Applied to ESM2, ProtoMech recovers 82–89% of the original model's performance on protein family classification and function prediction tasks. It also identifies compressed circuits that use less than 1% of the latent space while retaining up to 79% of model accuracy; these circuits correspond to structural and functional motifs, including binding, signaling, and stability. Steering along these circuits enables high-fitness protein design, surpassing baseline methods in more than 70% of cases.

📝 Abstract

Protein language models (pLMs) have emerged as powerful predictors of protein structure and function. However, the computational circuits underlying their predictions remain poorly understood. Recent mechanistic interpretability methods decompose pLM representations into interpretable features, but they treat each layer independently and thus fail to capture cross-layer computation, limiting their ability to approximate the full model. We introduce ProtoMech, a framework for discovering computational circuits in pLMs using cross-layer transcoders that learn sparse latent representations jointly across layers to capture the model's full computational circuitry. Applied to the pLM ESM2, ProtoMech recovers 82-89% of the original performance on protein family classification and function prediction tasks. ProtoMech then identifies compressed circuits that use <1% of the latent space while retaining up to 79% of model accuracy, revealing correspondence with structural and functional motifs, including binding, signaling, and stability. Steering along these circuits enables high-fitness protein design, surpassing baseline methods in more than 70% of cases. These results establish ProtoMech as a principled framework for protein circuit tracing.
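
To make the idea of a cross-layer transcoder concrete, here is a minimal NumPy sketch of the forward pass. It is an illustration under stated assumptions, not ProtoMech's implementation: the abstract does not specify the architecture, so the class name `CrossLayerTranscoder`, the top-k sparsity rule, the random (untrained) weights, and the `keep` mask are all hypothetical. The property it shows is that each layer encodes its own sparse latents, while the output reconstructed at a given layer sums decoder contributions from the latents of that layer and every earlier one.

```python
import numpy as np


def top_k_sparsify(z, k):
    """Keep the k largest activations in each row and zero the rest."""
    if k >= z.shape[-1]:
        return z
    idx = np.argpartition(z, -k, axis=-1)[..., -k:]
    out = np.zeros_like(z)
    np.put_along_axis(out, idx, np.take_along_axis(z, idx, axis=-1), axis=-1)
    return out


class CrossLayerTranscoder:
    """Hypothetical sketch: per-layer encoders, cross-layer decoders."""

    def __init__(self, n_layers, d_model, d_latent, k, seed=0):
        rng = np.random.default_rng(seed)
        self.n_layers, self.d_model, self.k = n_layers, d_model, k
        # One encoder per layer (assumption: random init, no training here).
        self.W_enc = rng.normal(0, d_model ** -0.5, (n_layers, d_model, d_latent))
        self.b_enc = np.zeros((n_layers, d_latent))
        # W_dec[(src, tgt)] maps latents read at layer src to the output
        # reconstructed at layer tgt, for every tgt >= src.
        self.W_dec = {
            (s, t): rng.normal(0, d_latent ** -0.5, (d_latent, d_model))
            for s in range(n_layers)
            for t in range(s, n_layers)
        }

    def encode(self, resid):
        """resid: (n_layers, n_tokens, d_model) -> list of sparse latents."""
        latents = []
        for layer in range(self.n_layers):
            pre = np.maximum(resid[layer] @ self.W_enc[layer] + self.b_enc[layer], 0.0)
            latents.append(top_k_sparsify(pre, self.k))
        return latents

    def decode(self, latents, keep=None):
        """Sum decoder contributions from all source layers <= target layer.

        keep: optional list of boolean masks, one per layer, selecting the
        subset of latents allowed to contribute (a "compressed circuit").
        """
        n_tokens = latents[0].shape[0]
        out = np.zeros((self.n_layers, n_tokens, self.d_model))
        for (src, tgt), W in self.W_dec.items():
            z = latents[src] if keep is None else latents[src] * keep[src]
            out[tgt] += z @ W
        return out


# Usage with made-up sizes: 3 layers, 5 tokens, 8-dim activations.
clt = CrossLayerTranscoder(n_layers=3, d_model=8, d_latent=32, k=4)
resid = np.random.default_rng(1).normal(size=(3, 5, 8))
latents = clt.encode(resid)
recon = clt.decode(latents)
```

Passing a `keep` mask that retains only a small fraction of latents per layer mimics, in spirit, evaluating a compressed circuit: the reconstruction is computed from the selected latents alone, and one would then measure how much downstream accuracy survives.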

Problem

Research questions and friction points this paper is trying to address.

protein language models

computational circuits

mechanistic interpretability

cross-layer computation

model interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-layer transcoders

protein language models

computational circuits