Freeze, Diffuse, Decode: Geometry-Aware Adaptation of Pretrained Transformer Embeddings for Antimicrobial Peptide Design

📅 2025-11-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets two problems that arise when pretrained Transformer embeddings are transferred to antimicrobial peptide (AMP) design, both of which worsen when supervised data are scarce: geometric distortion of the embedding space and insufficient capture of task-relevant signal. It proposes the “Freeze, Diffuse, Decode” (FDD) framework, which freezes the pretrained embeddings to preserve their intrinsic manifold geometry, uses graph diffusion to propagate supervised signals over the frozen manifold, and couples a lightweight decoder for downstream adaptation. Unlike fine-tuning or probing, FDD avoids manifold distortion and overfitting, yielding performance gains in few-shot settings. By integrating manifold learning with contrastive learning, the resulting representations are low-dimensional, interpretable, and predictive. They support several downstream tasks, including property prediction, similarity retrieval, and latent-space interpolation, and establish a geometry-aware transfer learning paradigm for biological sequences.
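Two of the downstream uses named above, similarity retrieval and latent-space interpolation, can be illustrated on any fixed set of representation vectors. The following is a minimal sketch under assumptions, not the paper's procedure: the function names `retrieve` and `interpolate` are hypothetical, cosine similarity and straight-line interpolation are illustrative choices, and the four 2-D vectors stand in for learned peptide representations.

```python
import numpy as np

def retrieve(Z, query, top_k=3):
    """Return indices of the top_k rows of Z most cosine-similar to query."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    qn = query / np.linalg.norm(query)
    return np.argsort(-(Zn @ qn))[:top_k]

def interpolate(z_a, z_b, n_points=5):
    """Return n_points evenly spaced points on the line from z_a to z_b."""
    t = np.linspace(0.0, 1.0, n_points)[:, None]
    return (1 - t) * z_a + t * z_b

# Toy stand-ins for low-dimensional representations (assumed data).
Z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
nearest = retrieve(Z, np.array([1.0, 0.05]), top_k=2)
path = interpolate(Z[0], Z[2], n_points=3)
```

In practice, each interpolated point would be passed to a decoder to obtain a candidate sequence; that step depends on the model and is omitted here.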

📝 Abstract

Pretrained transformers provide rich, general-purpose embeddings that can be transferred to downstream tasks. However, the current transfer strategies, fine-tuning and probing, either distort the pretrained geometric structure of the embeddings or lack sufficient expressivity to capture task-relevant signals. These issues become even more pronounced when supervised data are scarce. Here, we introduce Freeze, Diffuse, Decode (FDD), a novel diffusion-based framework that adapts pretrained embeddings to downstream tasks while preserving their underlying geometric structure. FDD propagates supervised signal along the intrinsic manifold of frozen embeddings, enabling a geometry-aware adaptation of the embedding space. Applied to antimicrobial peptide design, FDD yields low-dimensional, predictive, and interpretable representations that support property prediction, retrieval, and latent-space interpolation.
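The abstract does not specify the diffusion operator, so the sketch below substitutes a standard technique, label propagation on a k-nearest-neighbor graph, to illustrate the general idea of spreading a sparse supervised signal over frozen embeddings. Everything here is an assumption made for illustration: the function names `knn_affinity` and `diffuse_labels`, the cosine-similarity graph, the parameters `k`, `alpha`, and `n_steps`, and the toy 2-D embeddings. The decode step is not shown.

```python
import numpy as np

def knn_affinity(X, k=5):
    """Symmetric kNN affinity matrix built from cosine similarity.

    X holds the frozen embeddings (one row per item); it is never updated.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)            # exclude self-neighbors
    n = X.shape[0]
    idx = np.argsort(-S, axis=1)[:, :k]     # k most similar items per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)               # symmetrize

def diffuse_labels(W, y, labeled_mask, alpha=0.9, n_steps=50):
    """Iterate F <- alpha * P @ F + (1 - alpha) * Y0 on the graph W."""
    deg = W.sum(axis=1, keepdims=True)
    P = W / np.maximum(deg, 1e-12)          # row-stochastic transition matrix
    Y0 = np.where(labeled_mask, y, 0.0).astype(float)
    F = Y0.copy()
    for _ in range(n_steps):
        F = alpha * (P @ F) + (1 - alpha) * Y0
    return F

# Toy frozen embeddings: two groups of 10 points at different angles (assumed data).
theta = np.concatenate([np.arange(10) * 0.01, np.pi / 2 + np.arange(10) * 0.01])
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Only one labeled example per group: +1 for the first, -1 for the second.
y = np.zeros(20)
y[0], y[10] = 1.0, -1.0
mask = np.zeros(20, dtype=bool)
mask[[0, 10]] = True

F = diffuse_labels(knn_affinity(X, k=3), y, mask)
```

With a single label per group, the diffused scores in `F` take the sign of the labeled example in the same group, because the signal travels only along graph edges and the embeddings themselves stay untouched. A lightweight decoder could then be fit on such diffused signals.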

Problem

Research questions and friction points this paper is trying to address.

Adapting pretrained embeddings while preserving geometric structure

Addressing data scarcity in supervised learning tasks

Creating interpretable representations for antimicrobial peptide design

Innovation

Methods, ideas, or system contributions that make the work stand out.

The Freeze, Diffuse, Decode (FDD) framework preserves the geometric structure of pretrained embeddings

Diffusion propagates supervised signals along the intrinsic manifold of the frozen embeddings

Creates interpretable representations for peptide property prediction

🔎 Similar Papers

Diffusion on language model encodings for protein sequence generation