Steering Protein Language Models

📅 2025-07-01
🤖 AI Summary
Protein language models (PLMs) struggle to precisely control the generation of sequences with desired functional or physicochemical properties. To address this, we propose a fine-tuning-free, plug-and-play activation editing framework, the first to introduce activation steering into protein design. Our method comprises two components: (1) a learnable edit-site identification module that automatically locates latent-layer positions most sensitive to target attributes; and (2) a targeted activation intervention mechanism applied to evolutionarily pretrained PLMs, compatible with both autoencoding and autoregressive architectures. Evaluated on lysozyme-like sequence generation and optimization, our approach significantly improves functional property controllability while maintaining zero-shot adaptability across diverse PLMs without additional training.

📝 Abstract
Protein Language Models (PLMs), pre-trained on extensive evolutionary data from natural proteins, have emerged as indispensable tools for protein design. While powerful, PLMs often struggle to produce proteins with precisely specified functionalities or properties due to inherent challenges in controlling their outputs. In this work, we investigate the potential of Activation Steering, a technique originally developed for controlling text generation in Large Language Models (LLMs), to direct PLMs toward generating protein sequences with targeted properties. We propose a simple yet effective method that employs activation editing to steer PLM outputs, and extend this approach to protein optimization through a novel editing site identification module. Through comprehensive experiments on lysozyme-like sequence generation and optimization, we demonstrate that our methods can be seamlessly integrated into both auto-encoding and autoregressive PLMs without requiring additional training. These results highlight a promising direction for precise protein engineering using foundation models.
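The abstract does not spell out how the activation edit is computed. As a rough illustration only, the sketch below uses a formulation common in the LLM steering literature: take the difference between the mean hidden activations of sequences that have the target property and those that lack it, then add a scaled copy of that vector to every position's hidden state at inference time. The function names, the toy 2-dimensional activations, and the scale `alpha` are all assumptions for illustration, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean over a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(pos_acts: List[Vector], neg_acts: List[Vector]) -> Vector:
    """Difference of means: activations with the property minus those without."""
    mu_pos = mean_vector(pos_acts)
    mu_neg = mean_vector(neg_acts)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(hidden: List[Vector], v: Vector, alpha: float) -> List[Vector]:
    """Add alpha * v to the hidden state at every sequence position."""
    return [[h + alpha * d for h, d in zip(row, v)] for row in hidden]


# Toy data (hypothetical): pooled layer activations for two "positive"
# and two "negative" sequences, each a 2-dimensional vector.
pos_acts = [[1.0, 2.0], [3.0, 4.0]]
neg_acts = [[0.0, 1.0], [1.0, 1.0]]
v = steering_vector(pos_acts, neg_acts)

# Hidden states for a 2-residue sequence, edited with an assumed scale of 2.0.
hidden = [[0.0, 0.0], [1.0, -1.0]]
steered = steer(hidden, v, alpha=2.0)
print(v)        # [1.5, 2.0]
print(steered)  # [[3.0, 4.0], [4.0, 3.0]]
```

In a real PLM the same addition would be applied to one layer's output during the forward pass, which is why no weights need to be retrained; the choice of layer and of `alpha` would have to be tuned per model.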

Problem

Research questions and friction points this paper is trying to address.

Steering protein language models for precise control
Generating protein sequences with targeted properties
Overcoming challenges in controlling PLM outputs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Activation Steering for protein sequence control
Activation editing without additional training
Novel editing site identification module
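The page describes the editing site identification module as learnable but gives no further detail. Purely as a stand-in, the sketch below replaces the learned module with a fixed heuristic: score each sequence position by the projection of its hidden state onto a property direction, and pick the `k` lowest-scoring positions as candidates for editing. The function `rank_edit_sites`, the direction `v`, and the toy activations are hypothetical and should not be read as the authors' method.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_edit_sites(hidden: List[Vector], v: Vector, k: int) -> List[int]:
    """Return the indices of the k positions least aligned with direction v.

    This heuristic stands in for a learned scorer: a low projection is
    taken to mean the position contributes least to the target property.
    """
    scores = [dot(row, v) for row in hidden]
    order = sorted(range(len(hidden)), key=lambda i: scores[i])
    return sorted(order[:k])


# Toy per-position hidden states for a 4-residue sequence (hypothetical).
hidden = [[1.0, 0.0], [-1.0, 0.0], [0.5, 0.5], [-2.0, 1.0]]
v = [1.0, 0.0]  # assumed property direction
sites = rank_edit_sites(hidden, v, k=2)
print(sites)  # [1, 3]
```

Restricting edits to a few positions keeps the rest of the sequence unchanged, which suits optimization of an existing protein better than steering every position.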