Flow-Based Fragment Identification via Binding Site-Specific Latent Representations

📅 2025-09-16

📈 Citations: 0

✨ Influential: 0

career value

190K/year

🤖 AI Summary

In fragment-based drug discovery, identifying initial weak-binding, low-specificity fragments remains a critical bottleneck. This paper introduces LatentFrag—the first protein-fragment joint modeling framework that integrates contrastive learning with conditional generation: it constructs a shared latent space to enable protein surface-guided, chemically valid fragment embedding learning and 3D pose generation. The method significantly improves binding site identification sensitivity and virtual screening (VS) efficiency, achieving state-of-the-art fragment recovery rates on standard benchmarks; it accelerates generation by two orders of magnitude over conventional VS approaches while drastically reducing computational cost. Furthermore, LatentFrag is extended to full ligand generation, demonstrating end-to-end de novo design capability.

Technology Category

Application Category

📝 Abstract

Fragment-based drug design is a promising strategy leveraging the binding of small chemical moieties that can efficiently guide drug discovery. The initial step of fragment identification remains challenging, as fragments often bind weakly and non-specifically. We developed a protein-fragment encoder that relies on a contrastive learning approach to map both molecular fragments and protein surfaces in a shared latent space. The encoder captures interaction-relevant features and allows to perform virtual screening as well as generative design with our new method LatentFrag. In LatentFrag, fragment embeddings and positions are generated conditioned on the protein surface while being chemically realistic by construction. Our expressive fragment and protein representations allow location of protein-fragment interaction sites with high sensitivity and we observe state-of-the-art fragment recovery rates when sampling from the learned distribution of latent fragment embeddings. Our generative method outperforms common methods such as virtual screening at a fraction of its computational cost providing a valuable starting point for fragment hit discovery. We further show the practical utility of LatentFrag and extend the workflow to full ligand design tasks. Together, these approaches contribute to advancing fragment identification and provide valuable tools for fragment-based drug discovery.

Problem

Research questions and friction points this paper is trying to address.

Identifying weakly binding fragments for drug design

Mapping protein-fragment interactions in shared latent space

Generating chemically realistic fragment embeddings for screening

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive learning for shared latent space mapping

LatentFrag generates conditioned fragment embeddings

Learned distribution enables state-of-art recovery rates

🔎 Similar Papers

No similar papers found.