scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction

📅 2025-05-08

🤖 AI Summary
To address the challenge of cancer drug resistance, this work introduces scDrugMap, the first large-scale benchmark of foundation models for drug response prediction from single-cell data. It evaluates eight single-cell foundation models (e.g., scFoundation, scGPT, UCE) and two large language models (LLaMA, ChatGLM) under two adaptation strategies, layer freezing and LoRA-based fine-tuning, and also reports zero-shot performance. The evaluation spans 36 datasets comprising over 340,000 cells (over 326,000 in the primary collection and 18,800 in the validation set) and covers both pooled-data and cross-data settings. In the pooled-data setting, scFoundation achieves the best mean F1 score (0.971 with layer freezing, 0.947 with fine-tuning), outperforming the lowest-performing model by over 50%. In the cross-data setting, UCE attains a mean F1 of 0.774 after fine-tuning, and scGPT reaches 0.858 in zero-shot learning. The framework is open-sourced with CLI and web-based interfaces, advancing precision pharmacological modeling at single-cell resolution.

📝 Abstract
Drug resistance presents a major challenge in cancer therapy. Single-cell profiling offers insights into cellular heterogeneity, yet the application of large-scale foundation models for predicting drug response in single-cell data remains underexplored. To address this, we developed scDrugMap, an integrated framework featuring both a Python command-line interface and a web server for drug response prediction. scDrugMap evaluates a wide range of foundation models, including eight single-cell models and two large language models, using a curated dataset of over 326,000 cells in the primary collection and 18,800 cells in the validation set, spanning 36 datasets and diverse tissue and cancer types. We benchmarked model performance under pooled-data and cross-data evaluation settings, employing both layer freezing and Low-Rank Adaptation (LoRA) fine-tuning strategies. In the pooled-data scenario, scFoundation achieved the best performance, with mean F1 scores of 0.971 (layer freezing) and 0.947 (fine-tuning), outperforming the lowest-performing model by over 50%. In the cross-data setting, UCE performed best after fine-tuning (mean F1: 0.774), while scGPT led in zero-shot learning (mean F1: 0.858). Overall, scDrugMap provides the first large-scale benchmark of foundation models for drug response prediction in single-cell data and serves as a user-friendly, flexible platform for advancing drug discovery and translational research.
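The pooled-data and cross-data settings and the mean F1 metric mentioned in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not scDrugMap's code: the function names, the per-cell record layout (`dataset`, `label`), the binary labels, and the reading of "cross-data" as holding out one whole dataset at a time are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(scores):
    """Average a list of per-dataset (or per-run) F1 scores."""
    return sum(scores) / len(scores)


def pooled_split(cells, test_fraction=0.2, seed=0):
    """Pooled-data setting (assumed): mix cells from all datasets, then split."""
    shuffled = cells[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def cross_data_splits(cells):
    """Cross-data setting (assumed): test on a dataset never seen in training."""
    by_dataset = defaultdict(list)
    for cell in cells:
        by_dataset[cell["dataset"]].append(cell)
    for held_out in sorted(by_dataset):
        train = [c for c in cells if c["dataset"] != held_out]
        yield held_out, train, by_dataset[held_out]
```

In the pooled split, cells from the same dataset can land in both train and test, whereas the cross-data split always tests on an unseen dataset. That difference is one plausible reason the cross-data scores reported above are lower than the pooled ones.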
Problem

Research questions and friction points this paper is trying to address.

Evaluating foundation models for drug response prediction in single-cell data
Addressing drug resistance challenges in cancer therapy using large-scale models
Providing a benchmark and platform for drug discovery with diverse datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated framework with CLI and web server
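Evaluates 10 foundation models (eight single-cell models and two large language models) on over 326,000 primary-collection cells and 18,800 validation cells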
Employs layer freezing and LoRA fine-tuning
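The difference between layer freezing and LoRA fine-tuning can be made concrete with a small sketch of the standard LoRA update, in which a frozen weight matrix W is adjusted by a low-rank product: W + (alpha / r) * B A. The helper names, matrix sizes, and the `alpha` and `rank` values below are assumptions for illustration only. The source does not state the hyperparameters or layers scDrugMap adapts.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank.

    W is d_out x d_in and stays frozen; only A (r x d_in) and B (d_out x r)
    would be trained.
    """
    r = len(a)
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank=None):
    """Trainable weights in one linear layer: 0 if frozen, r * (d_in + d_out) with LoRA."""
    return 0 if rank is None else rank * (d_in + d_out)


# Hypothetical layer size: a 768 x 768 projection adapted with rank 8.
full = 768 * 768
lora = trainable_params(768, 768, rank=8)
print(f"full fine-tuning: {full} weights, LoRA: {lora} weights, frozen: 0")
```

With B initialized to zeros, as is conventional for LoRA, the effective weight equals W at the start of training, so the adapted model begins from the pretrained behavior. Under layer freezing, by contrast, the backbone weights never change and typically only a small prediction head on top is trained.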
Qing Wang
Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL 32611, USA
Yining Pan
Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL 32611, USA
Minghao Zhou
Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL 32611, USA
Zijia Tang
Trinity College, Duke University, Durham, NC, USA
Yanfei Wang
Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL 32611, USA
Guangyu Wang
Houston Methodist
Bioinformatics, Computational biology, AI, epigenetics
Qianqian Song
Assistant Professor, University of Florida
Translational Bioinformatics, Biomedical Informatics, Artificial Intelligence