CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis

📅 2026-02-12
🤖 AI Summary
This work addresses two obstacles to cell type annotation in single-cell RNA sequencing: marker genes are tissue- and state-dependent, and novel cellular states lack reference profiles. It introduces CellMaster, a large language model–based AI agent (e.g., leveraging GPT-4o) that performs zero-shot, automatic cell annotation without pretraining or a fixed marker gene database. CellMaster emulates expert reasoning to deliver interpretable annotation rationales and supports human-in-the-loop refinement. Evaluated on 9 datasets spanning 8 tissues, CellMaster improves annotation accuracy by 7.1% over the best-performing baselines in fully automatic mode; with human-in-the-loop refinement the advantage rises to 18.6%, with a 22.1% gain on subtype populations.

📝 Abstract
Single-cell RNA-seq (scRNA-seq) enables atlas-scale profiling of complex tissues, revealing rare lineages and transient states. Yet, assigning biologically valid cell identities remains a bottleneck because markers are tissue- and state-dependent, and novel states lack references. We present CellMaster, an AI agent that mimics expert practice for zero-shot cell-type annotation. Unlike existing automated tools, CellMaster leverages LLM-encoded knowledge (e.g., GPT-4o) to perform on-the-fly annotation with interpretable rationales, without pre-training or fixed marker databases. Across 9 datasets spanning 8 tissues, CellMaster improved accuracy by 7.1% over best-performing baselines (including CellTypist and scTab) in automatic mode. With human-in-the-loop refinement, this advantage increased to 18.6%, with a 22.1% gain on subtype populations. The system demonstrates particular strength in rare and novel cell states where baselines often fail. Source code and the web application are available at https://github.com/AnonymousGym/CellMaster.
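To make the workflow in the abstract concrete, here is a minimal Python sketch of zero-shot, marker-driven annotation with an optional expert-feedback pass. It is not CellMaster's implementation: the function names (`build_prompt`, `annotate`, `toy_llm`), the prompt wording, and the JSON reply format are all assumptions made for illustration, and `toy_llm` is a hard-coded stand-in for a real model call such as GPT-4o.

```python
import json
from typing import Callable, Dict, List, Optional


def build_prompt(cluster: str, markers: List[str], tissue: str,
                 note: Optional[str] = None) -> str:
    """Compose a zero-shot annotation request for one cluster (hypothetical format)."""
    lines = [
        f"Tissue: {tissue}",
        f"Cluster {cluster} top marker genes: {', '.join(markers)}",
        'Reply with JSON: {"cell_type": ..., "rationale": ...}',
    ]
    if note:  # human-in-the-loop hint supplied by an expert reviewer
        lines.append(f"Expert feedback: {note}")
    return "\n".join(lines)


def annotate(markers_by_cluster: Dict[str, List[str]], tissue: str,
             llm: Callable[[str], str],
             feedback: Optional[Dict[str, str]] = None) -> Dict[str, Dict[str, str]]:
    """Ask the model for a label and a rationale per cluster."""
    feedback = feedback or {}
    results = {}
    for cluster, markers in markers_by_cluster.items():
        prompt = build_prompt(cluster, markers, tissue, feedback.get(cluster))
        reply = json.loads(llm(prompt))
        results[cluster] = {
            "cell_type": reply["cell_type"],
            "rationale": reply["rationale"],
        }
    return results


def toy_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; keyed on a few well-known marker genes."""
    if "FOXP3" in prompt:
        label, why = "regulatory T cell", "FOXP3 marks regulatory T cells"
    elif "CD3E" in prompt:
        label, why = "T cell", "CD3E is a pan-T-cell marker"
    elif "MS4A1" in prompt:
        label, why = "B cell", "MS4A1 (CD20) is a B-cell marker"
    else:
        label, why = "unknown", "no recognized marker"
    return json.dumps({"cell_type": label, "rationale": why})


markers = {"0": ["CD3E", "CD3D", "IL7R"], "1": ["MS4A1", "CD79A"]}
first_pass = annotate(markers, "blood", toy_llm)
refined = annotate(markers, "blood", toy_llm,
                   feedback={"0": "FOXP3 is also high in this cluster"})
```

In this sketch the automatic mode is the first call, and the human-in-the-loop mode simply re-queries with a reviewer's note appended to the prompt. A real system would swap `toy_llm` for an actual model client and derive the marker lists from differential expression.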
Problem

Research questions and friction points this paper is trying to address.

cell type annotation
single-cell RNA-seq
rare cell states
novel cell types
biological identity
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot cell annotation
large language model (LLM)
interpretable AI
human-in-the-loop
single-cell RNA-seq
Authors

Zhen Wang, Postdoc at UCSD (Machine Learning, Large Language Models, Natural Language Processing)
Yiming Gao, Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
Jieyuan Liu, Halicioglu Data Science Institute, University of California, San Diego, CA, USA
Enze Ma, Halicioglu Data Science Institute, University of California, San Diego, CA, USA
Jefferson Chen, Halicioglu Data Science Institute, University of California, San Diego, CA, USA
Mark Antkowiak, Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
Mengzhou Hu, Department of Medicine, University of California, San Diego, CA, USA
JungHo Kong, Department of Medicine, University of California, San Diego, CA, USA
Dexter Pratt, UC San Diego (NDEx, Cytoscape, Network Biology, Systems Biology)
Zhiting Hu, Assistant Professor at UC San Diego (Machine Learning, Artificial Intelligence, Natural Language Processing)
Wei Wang, Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, CA, USA
Trey Ideker, University of California San Diego (Cancer, Systems Biology, Networks, Bioinformatics)
Eric P. Xing, Mohamed bin Zayed University of AI, Abu Dhabi, UAE; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA