AI Summary
This work addresses the challenges of representation collapse, label scarcity, and domain shift in drug-target affinity (DTA) prediction under cold-start scenarios by proposing Co-Diffusion, a novel framework that introduces, for the first time, an affinity-aware latent diffusion mechanism into DTA modeling. The approach adopts a two-stage paradigm: it first constructs an affinity-guided latent manifold and then incorporates modality-specific latent diffusion as a regularizer to mitigate the conflict between generative and regression objectives. By integrating supervised embedding alignment, modality-specific perturbations, and variational inference optimization, Co-Diffusion substantially enhances zero-shot generalization on unseen molecular scaffolds and novel protein families. Extensive experiments demonstrate that the method consistently outperforms state-of-the-art approaches across multiple benchmarks, offering robust support for virtual screening applications.
Abstract
Predicting drug-target affinity (DTA) is fundamental to virtual screening and lead optimization. However, existing deep models often suffer from representation collapse in stringent cold-start regimes, where label scarcity and domain shift prevent the learning of transferable pharmacophores and binding motifs. In this paper, we propose Co-Diffusion, a novel affinity-aware framework that recasts DTA prediction as a constrained latent denoising process to enhance generalization. Co-Diffusion employs a two-stage paradigm: Stage I establishes an affinity-steered latent manifold by aligning drug and target embeddings under an explicit supervised objective, ensuring that the latent space reflects the intrinsic binding landscape. Stage II introduces modality-specific latent diffusion as a stochastic perturb-and-denoise regularizer, forcing the model to recover consistent affinity semantics from noisy structural representations. This approach effectively mitigates the reconstruction-regression conflict common in generative DTA models. Theoretically, we show that Co-Diffusion maximizes a variational lower bound on the joint likelihood of drug structures, protein sequences, and binding strength. Extensive experiments across multiple benchmarks demonstrate that Co-Diffusion significantly outperforms state-of-the-art baselines, with particularly strong zero-shot generalization on unseen molecular scaffolds and novel protein families, paving a robust path for in silico drug prioritization in unexplored chemical spaces.
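The Stage II idea of a perturb-and-denoise regularizer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear noise schedule, the noise-prediction (epsilon) objective, the weighting `lam`, and every function name below are assumptions chosen for illustration, and the toy denoiser and affinity head stand in for learned networks.

```python
import numpy as np

def make_alpha_bar(num_steps=100, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention coefficients for an (assumed) linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def perturb(z, t, alpha_bar, rng):
    """Forward diffusion on a latent: z_t = sqrt(a_t) * z + sqrt(1 - a_t) * eps."""
    eps = rng.standard_normal(z.shape)
    a = alpha_bar[t]
    return np.sqrt(a) * z + np.sqrt(1.0 - a) * eps, eps

def denoise_and_score(z, t, alpha_bar, denoiser, rng):
    """Noise one modality's latent, predict the noise, and reconstruct the latent."""
    z_noisy, eps = perturb(z, t, alpha_bar, rng)
    eps_hat = denoiser(z_noisy, t)
    a = alpha_bar[t]
    z_rec = (z_noisy - np.sqrt(1.0 - a) * eps_hat) / np.sqrt(a)
    return z_rec, np.mean((eps_hat - eps) ** 2)

def stage2_loss(z_drug, z_target, y, t, alpha_bar,
                denoise_drug, denoise_target, predict_affinity, rng, lam=0.1):
    """Affinity regression on denoised latents plus per-modality denoising terms."""
    zd, loss_d = denoise_and_score(z_drug, t, alpha_bar, denoise_drug, rng)
    zt, loss_t = denoise_and_score(z_target, t, alpha_bar, denoise_target, rng)
    regression = np.mean((predict_affinity(zd, zt) - y) ** 2)
    return regression + lam * (loss_d + loss_t)

# Toy usage with placeholder components (hypothetical, untrained).
rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
z_drug = rng.standard_normal((8, 16))     # stand-in for Stage I drug latents
z_target = rng.standard_normal((8, 16))   # stand-in for Stage I target latents
y = rng.standard_normal(8)                # stand-in for affinity labels
zero_denoiser = lambda z_noisy, t: np.zeros_like(z_noisy)
dot_head = lambda zd, zt: np.sum(zd * zt, axis=1)
loss = stage2_loss(z_drug, z_target, y, 10, alpha_bar,
                   zero_denoiser, zero_denoiser, dot_head, rng)
```

In this sketch each modality has its own denoiser, and the affinity head only sees reconstructed latents, so lowering the regression term requires the denoisers to preserve affinity-relevant information under noise.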