Transfer Learning for T-Cell Response Prediction

📅 2024-03-18
🏛️ arXiv.org
📈 Citations: 0
Influential: 0

🤖 AI Summary
Predicting T-cell responses to peptide sequences is an important step towards personalized cancer vaccines, but the available training data are limited and heterogeneous, drawn from multiple sources. Models trained on such data are prone to shortcut learning: they pick up domain-specific cues such as the source organism instead of peptide-level signals associated with T-cell response, which inflates measured performance. Using a Transformer model, the authors show that this inflation occurs in practice and propose a domain-aware evaluation scheme to expose it. They then compare transfer learning techniques for the multi-domain setting and find per-source fine-tuning effective across a wide range of peptide sources. The final model is competitive with existing state-of-the-art approaches for predicting T-cell responses to human peptides.

📝 Abstract
We study the prediction of T-cell response for specific given peptides, which could, among other applications, be a crucial step towards the development of personalized cancer vaccines. It is a challenging task due to limited, heterogeneous training data featuring a multi-domain structure; such data entail the danger of shortcut learning, where models learn general characteristics of peptide sources, such as the source organism, rather than specific peptide characteristics associated with T-cell response. Using a transformer model for T-cell response prediction, we show that the danger of inflated predictive performance is not merely theoretical but occurs in practice. Consequently, we propose a domain-aware evaluation scheme. We then study different transfer learning techniques to deal with the multi-domain structure and shortcut learning. We demonstrate a per-source fine tuning approach to be effective across a wide range of peptide sources and further show that our final model is competitive with existing state-of-the-art approaches for predicting T-cell responses for human peptides.
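
The shortcut-learning risk described in the abstract can be illustrated with a small sketch. It assumes ROC AUC as the metric and a toy data set with two sources; neither is taken from the paper, and the function names `roc_auc` and `per_domain_auc` are hypothetical. The toy "model" scores every peptide by its source alone, so the pooled score looks good while the per-source scores are at chance level.

```python
from collections import defaultdict

def roc_auc(labels, scores):
    """Pairwise AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    # Ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def per_domain_auc(labels, scores, domains):
    """Compute AUC separately within each source domain."""
    grouped = defaultdict(lambda: ([], []))
    for y, s, d in zip(labels, scores, domains):
        grouped[d][0].append(y)
        grouped[d][1].append(s)
    return {d: roc_auc(ys, ss) for d, (ys, ss) in grouped.items()}

# Toy data (assumed): viral peptides are mostly positive, human mostly negative.
domains = ["human"] * 4 + ["virus"] * 4
labels = [0, 0, 0, 1, 1, 1, 1, 0]
# A shortcut model that only looks at the source, not the peptide.
scores = [0.2] * 4 + [0.8] * 4

print(roc_auc(labels, scores))                  # 0.75
print(per_domain_auc(labels, scores, domains))  # {'human': 0.5, 'virus': 0.5}
```

The pooled AUC of 0.75 comes entirely from the difference in positive rates between sources; within each source the model cannot rank peptides at all.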
Problem

Research questions and friction points this paper is trying to address.

Predict T-cell response for peptides
Address limited heterogeneous training data
Prevent shortcut learning in models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer model application
Domain-aware evaluation scheme
Per-source fine tuning approach
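
As a rough illustration of the per-source fine-tuning idea listed above, the sketch below pretrains one model on data pooled from all sources and then continues training a separate copy on each source. A tiny logistic regression stands in for the Transformer, and the data, hyperparameters, and function names (`sgd_logreg`, `per_source_fine_tune`) are assumptions for illustration, not the authors' implementation.

```python
import math

def sgd_logreg(weights, data, epochs=50, lr=0.1):
    """Train logistic-regression weights with plain SGD; returns a new list."""
    w = list(weights)  # copy so the caller's weights are not modified
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi  # gradient of the log loss
    return w

def per_source_fine_tune(data_by_source, n_features,
                         pretrain_epochs=50, finetune_epochs=20):
    """Pretrain on pooled data, then fine-tune one copy per source."""
    pooled = [ex for examples in data_by_source.values() for ex in examples]
    base = sgd_logreg([0.0] * n_features, pooled, epochs=pretrain_epochs)
    tuned = {src: sgd_logreg(base, examples, epochs=finetune_epochs)
             for src, examples in data_by_source.items()}
    return base, tuned

# Toy data (assumed): features are [bias, signal]; the signal has opposite
# meaning in the two sources, so a single pooled model cannot fit both.
data_by_source = {
    "human": [([1.0, 1.0], 1), ([1.0, -1.0], 0)],
    "virus": [([1.0, 1.0], 0), ([1.0, -1.0], 1)],
}
base, tuned = per_source_fine_tune(data_by_source, n_features=2)
print(base)
print(tuned)
```

In this toy setting the pooled model's signal weight stays near zero, while each fine-tuned copy learns the sign that fits its own source.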
Josua Stadelmaier
Department of Computer Science, University of Tübingen, Tübingen, 72076, Germany
Brandon Malone
NEC OncoImmunity, Oslo, Norway
Ralf Eggeling
Department of Computer Science, University of Tübingen, Tübingen, 72076, Germany