M3ST-DTI: A multi-task learning model for drug-target interactions based on multi-modal features and multi-stage alignment

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing drug–target interaction (DTI) prediction methods struggle to model deep intra-modal feature interactions and suffer from insufficient cross-modal alignment, limiting both predictive performance and generalizability. To address these challenges, we propose M3ST-DTI—a multimodal, multi-stage alignment framework. First, it introduces a two-phase alignment strategy: early-stage multi-scale contrastive alignment (MCA) with Gram-matrix structural regularization, followed by late-stage fine-grained bidirectional cross-attention (BCA). Second, it incorporates a deep orthogonal fusion module to suppress modality redundancy. Third, it integrates self-attention, hybrid pooling, and graph attention mechanisms to enable efficient representation learning and alignment of heterogeneous features—textual, structural, and functional. Extensive experiments on multiple benchmark datasets demonstrate that M3ST-DTI consistently outperforms state-of-the-art methods, achieving significant gains in prediction accuracy, robustness, and cross-dataset generalization.

📝 Abstract
Accurate prediction of drug-target interactions (DTI) is pivotal in drug discovery. However, existing approaches often fail to capture deep intra-modal feature interactions or achieve effective cross-modal alignment, limiting predictive performance and generalization. To address these challenges, we propose M3ST-DTI, a multi-task learning model that enables multi-stage integration and alignment of multi-modal features for DTI prediction. M3ST-DTI incorporates three types of features (textual, structural, and functional) and enhances intra-modal representations using self-attention mechanisms and a hybrid pooling graph attention module. For early-stage feature alignment and fusion, the model integrates multi-scale contrastive alignment (MCA) with Gram loss as a structural constraint. In the later stage, a bidirectional cross-attention (BCA) module captures fine-grained interactions between drugs and targets within each modality, while a deep orthogonal fusion module mitigates feature redundancy. Extensive evaluations on benchmark datasets demonstrate that M3ST-DTI consistently outperforms state-of-the-art methods across diverse metrics.
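To make the early-stage alignment idea concrete, here is a minimal sketch of a contrastive alignment loss between two modality embeddings of the same batch of samples. It is not the paper's MCA module. It assumes a standard symmetric InfoNCE objective at a single scale, and it omits the Gram-loss constraint because the abstract does not define it. The names `symmetric_info_nce`, `l2_normalize`, and the `temperature` value are hypothetical choices for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed stand-in, not the paper's exact loss).

    Row i of `a` and row i of `b` are two modality views of the same sample
    and form the positive pair; all other rows in the batch act as negatives.
    """
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()  # a -> b direction
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()  # b -> a direction
    return 0.5 * (loss_ab + loss_ba)


# Illustrative data only: random embeddings, not real drug or target features.
rng = np.random.default_rng(0)
textual = rng.normal(size=(8, 16))
structural = textual + 0.01 * rng.normal(size=(8, 16))  # nearly aligned view
loss_aligned = symmetric_info_nce(textual, structural)
loss_mismatched = symmetric_info_nce(textual, structural[::-1])  # pairs broken
```

In this sketch the loss is low when matching rows are the most similar pair in the batch and high when the pairing is broken, which is the property an alignment objective of this kind relies on.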
Problem

Research questions and friction points this paper is trying to address.

Predicting drug-target interactions using multi-modal features and alignment
Improving intra-modal feature interactions with self-attention mechanisms
Enhancing cross-modal alignment through multi-stage fusion techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-task learning with multi-modal feature integration
Self-attention and graph attention for intra-modal enhancement
Multi-stage alignment using cross-attention and orthogonal fusion
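The cross-attention step listed above can be sketched as follows. This is a simplified illustration, not the authors' BCA module: it assumes single-head scaled dot-product attention, leaves out the learned query/key/value projections, and uses a plain residual connection. The function names and the token shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values):
    """Scaled dot-product attention where one sequence attends to another.

    Learned projection matrices are omitted for brevity (an assumption of
    this sketch), so keys and values are the same raw token features.
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_query, n_key)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ keys_values, weights


def bidirectional_cross_attention(drug_tokens, target_tokens):
    """Run attention in both directions and add a residual connection."""
    drug_ctx, w_drug_to_target = cross_attention(drug_tokens, target_tokens)
    target_ctx, w_target_to_drug = cross_attention(target_tokens, drug_tokens)
    return (drug_tokens + drug_ctx,
            target_tokens + target_ctx,
            w_drug_to_target,
            w_target_to_drug)


# Illustrative shapes only: 5 drug tokens and 7 target tokens, 8 features each.
rng = np.random.default_rng(0)
drug = rng.normal(size=(5, 8))
target = rng.normal(size=(7, 8))
drug_out, target_out, w_dt, w_td = bidirectional_cross_attention(drug, target)
```

Each drug token receives a weighted summary of the target tokens and vice versa, so the two attention maps `w_dt` and `w_td` have transposed shapes but are computed independently.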
🔎 Similar Papers
2024-01-02 | IEEE International Conference on Bioinformatics and Biomedicine | Citations: 0
Xiangyu Li
College of Intelligence and Computing, Tianjin University, Tianjin, China
Ran Su
Tianjin University
Medical imaging, bioinformatics
Liangliang Liu
College of Information and Management Science, Henan Agricultural University, Zhengzhou, China