AI Summary
To address geometric distortion and insufficient task-signal capture when transferring pretrained Transformer embeddings for antimicrobial peptide (AMP) design, particularly under data-scarce supervision, this paper proposes the "Freeze, Diffuse, Decode" (FDD) framework. FDD freezes pretrained embeddings to preserve their intrinsic manifold geometry, introduces graph diffusion to efficiently propagate supervised signals over the frozen manifold, and couples a lightweight decoder for downstream adaptation. Unlike fine-tuning or probing, FDD avoids manifold distortion and overfitting, yielding significant performance gains in few-shot settings. By integrating manifold learning with contrastive learning, the resulting representations are low-dimensional, interpretable, and highly predictive. They support diverse downstream tasks, including property prediction, similarity retrieval, and latent-space interpolation, establishing a novel geometry-aware transfer learning paradigm for biological sequences.
Abstract
Pretrained transformers provide rich, general-purpose embeddings that can be transferred to downstream tasks. However, current transfer strategies, namely fine-tuning and probing, either distort the pretrained geometric structure of the embeddings or lack sufficient expressivity to capture task-relevant signals. These issues become even more pronounced when supervised data are scarce. Here, we introduce Freeze, Diffuse, Decode (FDD), a novel diffusion-based framework that adapts pretrained embeddings to downstream tasks while preserving their underlying geometric structure. FDD propagates the supervised signal along the intrinsic manifold of the frozen embeddings, enabling a geometry-aware adaptation of the embedding space. Applied to antimicrobial peptide design, FDD yields low-dimensional, predictive, and interpretable representations that support property prediction, retrieval, and latent-space interpolation.
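To make the idea of propagating a supervised signal over frozen embeddings concrete, the following is a minimal sketch and not the paper's implementation. It assumes a symmetric k-nearest-neighbor graph with Gaussian weights and swaps in a standard label-spreading iteration as the diffusion step; the abstract does not specify FDD's actual graph construction, diffusion operator, or decoder. The function names (`knn_affinity`, `diffuse_labels`), the parameters `k`, `alpha`, and `steps`, and the toy data are all hypothetical.

```python
import numpy as np

def knn_affinity(X, k=10):
    """Symmetric kNN affinity matrix with Gaussian weights (assumed graph choice)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)              # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]       # indices of the k nearest points
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    nn_d2 = d2[rows, cols]
    sigma2 = np.median(nn_d2) + 1e-12         # bandwidth heuristic (assumption)
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-nn_d2 / sigma2)
    return np.maximum(W, W.T)                 # symmetrize

def diffuse_labels(X, y, labeled_mask, k=10, alpha=0.9, steps=50):
    """Spread sparse labels over the graph; the embeddings X are never updated."""
    W = knn_affinity(X, k)
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # normalized affinity
    Y0 = np.where(labeled_mask, y, 0.0)                 # unlabeled points start at 0
    F = Y0.copy()
    for _ in range(steps):
        # Mix the neighbors' current scores with the original supervised signal.
        F = alpha * (S @ F) + (1.0 - alpha) * Y0
    return F

# Toy stand-in for frozen embeddings: two well-separated clusters, one label each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 8)), rng.normal(3.0, 0.3, (20, 8))])
y = np.zeros(40)
y[0], y[20] = 1.0, -1.0
mask = np.zeros(40, dtype=bool)
mask[[0, 20]] = True
scores = diffuse_labels(X, y, mask)
```

In this toy setting each unlabeled point inherits the sign of the single labeled point in its cluster, which illustrates how two labels can inform forty points without changing the embeddings themselves. A lightweight decoder, as described for FDD, would then be trained on top of such diffused signals.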