NbBench: Benchmarking Language Models for Comprehensive Nanobody Tasks

📅 2025-05-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
The absence of a unified evaluation benchmark hinders the advancement of representation learning for nanobodies. Method: We introduce NbBench, the first comprehensive nanobody benchmark, encompassing eight tasks across structural annotation, antigen-binding prediction, and developability assessment, and systematically evaluate eleven protein and antibody language models on nine curated datasets. A frozen-weight evaluation paradigm and a multi-task standardization protocol ensure fair, reproducible comparisons. Contribution/Results: Antibody-specific language models excel in antigen-related tasks, while regression tasks such as thermostability prediction remain challenging for all models; no single model dominates across all tasks. NbBench delineates the capability boundaries of current models, provides an open-source, fully reproducible framework, and pushes nanobody modeling toward task-decoupled design and model specialization.
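
Under the frozen-weight paradigm, the pretrained language model is used only as a fixed feature extractor and a lightweight head is trained per task on its embeddings. As a rough illustration only (not the paper's actual pipeline), the sketch below extracts mean-pooled frozen embeddings for nanobody sequences with a HuggingFace protein LM; the checkpoint name and the pooling choice are assumptions.

```python
# Minimal sketch of frozen-embedding extraction (illustrative only).
# Assumes the HuggingFace `transformers` library and the public ESM-2
# checkpoint facebook/esm2_t6_8M_UR50D; the models, pooling, and
# preprocessing used in NbBench may differ.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "facebook/esm2_t6_8M_UR50D"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()  # frozen: the language model receives no gradient updates

@torch.no_grad()
def embed(sequences):
    """Return one mean-pooled embedding vector per nanobody sequence."""
    batch = tokenizer(sequences, return_tensors="pt", padding=True)
    hidden = model(**batch).last_hidden_state      # (batch, length, dim)
    mask = batch["attention_mask"].unsqueeze(-1)   # (batch, length, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Truncated, purely illustrative VHH-like fragments
print(embed(["QVQLVESGGGLVQAGGSLRLSCAAS", "EVQLVESGGGLVQPGGSLRLSCAAS"]).shape)
```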

📝 Abstract
Nanobodies, single-domain antibody fragments derived from camelid heavy-chain-only antibodies, exhibit unique advantages such as compact size, high stability, and strong binding affinity, making them valuable tools in therapeutics and diagnostics. While recent advances in pretrained protein and antibody language models (PPLMs and PALMs) have greatly enhanced biomolecular understanding, nanobody-specific modeling remains underexplored and lacks a unified benchmark. To address this gap, we introduce NbBench, the first comprehensive benchmark suite for nanobody representation learning. Spanning eight biologically meaningful tasks across nine curated datasets, NbBench encompasses structure annotation, binding prediction, and developability assessment. We systematically evaluate eleven representative models--including general-purpose protein LMs, antibody-specific LMs, and nanobody-specific LMs--in a frozen setting. Our analysis reveals that antibody language models excel in antigen-related tasks, while performance on regression tasks such as thermostability and affinity remains challenging across all models. Notably, no single model consistently outperforms others across all tasks. By standardizing datasets, task definitions, and evaluation protocols, NbBench offers a reproducible foundation for assessing and advancing nanobody modeling.
Problem

Research questions and friction points this paper is trying to address.

Lack of unified benchmark for nanobody-specific modeling
Underexplored nanobody representation learning in biomolecular research
No single model excels across all nanobody-related tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces NbBench, the first comprehensive benchmark for nanobody representation learning
Evaluates eleven protein, antibody, and nanobody language models across eight tasks in a frozen setting
Standardizes datasets, task definitions, and evaluation protocols
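
To make the standardization idea concrete, a common pattern (assumed here, not quoted from the paper) is to train the same kind of lightweight scikit-learn head on every model's frozen embeddings and score each task with a fixed metric, e.g., F1 for binding classification and Spearman correlation for thermostability regression. The sketch below is hypothetical and uses random placeholder data in place of real embeddings and labels.

```python
# Hypothetical sketch of a standardized lightweight-head evaluation step.
# Random placeholder data stands in for frozen embeddings and labels; the
# real NbBench datasets, splits, heads, and metrics may differ.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def evaluate_head(X, y, task):
    """Train a small head on frozen embeddings and report one task metric."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    if task == "classification":  # e.g., antigen-binding prediction
        head = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        return "F1", f1_score(y_te, head.predict(X_te))
    # e.g., thermostability or affinity regression
    head = Ridge(alpha=1.0).fit(X_tr, y_tr)
    rho, _ = spearmanr(y_te, head.predict(X_te))
    return "Spearman", rho

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 320))  # placeholder "frozen embeddings"
print(evaluate_head(X, rng.integers(0, 2, 200), "classification"))
print(evaluate_head(X, rng.normal(size=200), "regression"))
```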
Yiming Zhang
Department of Computational Biology and Medical Sciences, The University of Tokyo, Japan
Koji Tsuda
Professor, GSFS, The University of Tokyo
Machine Learning · Computational Biology