Cell2Text: Multimodal LLM for Generating Single-Cell Descriptions from RNA-Seq Data

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current single-cell foundation models produce only discrete labels, failing to deliver biologically rich, interpretable cell descriptions required by domain experts. To address this limitation, we propose the first generative framework that directly maps single-cell RNA-seq data to structured natural language descriptions—encompassing cell type, tissue origin, disease associations, and pathway activity. Our method employs a multimodal learning architecture that jointly leverages gene-level embeddings from a single-cell foundation model and a pretrained large language model, enabling end-to-end sequence-to-text generation. Experiments demonstrate that our framework outperforms classification baselines in label accuracy while achieving significant improvements in semantic fidelity (e.g., BERTScore) and ontology-aware similarity (e.g., Cell Ontology-based path similarity). Crucially, it delivers high interpretability through human-readable outputs and strong generalization across diverse datasets and cell types.
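The summary describes an architecture in which gene-level embeddings from a frozen single-cell foundation model are fed into a pretrained LLM for sequence-to-text generation. A common way to couple the two modalities is a learned projection that maps the cell embedding into a few "soft prompt" vectors prepended to the LLM's token embeddings. The sketch below illustrates that idea only; the dimensions, the single linear projection, and all variable names are assumptions for illustration, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions, not taken from the paper)
d_cell, d_llm = 512, 768     # cell-embedding dim, LLM hidden dim
n_prefix, n_tokens = 8, 16   # soft-prompt length, text-prompt length

def project_cell_embedding(cell_emb, W, b):
    """Map one cell embedding to n_prefix soft-prompt vectors in LLM space."""
    flat = cell_emb @ W + b               # shape: (n_prefix * d_llm,)
    return flat.reshape(n_prefix, d_llm)  # shape: (n_prefix, d_llm)

# Learned projection parameters (randomly initialized here)
W = rng.normal(scale=0.02, size=(d_cell, n_prefix * d_llm))
b = np.zeros(n_prefix * d_llm)

cell_emb = rng.normal(size=d_cell)            # from the scRNA-seq foundation model
prefix = project_cell_embedding(cell_emb, W, b)

token_embs = rng.normal(size=(n_tokens, d_llm))  # embedded text prompt tokens
# The LLM then attends over [cell prefix ; text tokens] to generate the description
llm_input = np.concatenate([prefix, token_embs], axis=0)
print(llm_input.shape)  # (24, 768)
```

In training, only the projection (and optionally LLM adapter weights) would be updated against the ground-truth description text; the foundation-model encoder can stay frozen.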

📝 Abstract
Single-cell RNA sequencing has transformed biology by enabling the measurement of gene expression at cellular resolution, providing information about cell types, states, and disease contexts. Recently, single-cell foundation models have emerged as powerful tools for learning transferable representations directly from expression profiles, improving performance on classification and clustering tasks. However, these models are limited to discrete prediction heads, which collapse cellular complexity into predefined labels that fail to capture the richer, contextual explanations biologists need. We introduce Cell2Text, a multimodal generative framework that translates scRNA-seq profiles into structured natural language descriptions. By integrating gene-level embeddings from single-cell foundation models with pretrained large language models, Cell2Text generates coherent summaries that capture cellular identity, tissue origin, disease associations, and pathway activity, generalizing to unseen cells. Empirically, Cell2Text outperforms baselines on classification accuracy, demonstrates strong ontological consistency using PageRank-based similarity metrics, and achieves high semantic fidelity in text generation. These results demonstrate that coupling expression data with natural language offers both stronger predictive performance and inherently interpretable outputs, pointing to a scalable path for label-efficient characterization of unseen cells.
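Both summaries mention ontology-aware evaluation (Cell Ontology path similarity). The general idea is to score a predicted cell-type label not as right/wrong but by how close it sits to the true label in the ontology graph. A minimal stdlib sketch, using a tiny hypothetical ontology fragment and a simple inverse-path-length score (the paper's exact PageRank-based metric is not reproduced here):

```python
from collections import deque

# Tiny toy fragment of a cell-type hierarchy (hypothetical, for illustration)
ONTOLOGY = {
    "cell": ["immune cell", "epithelial cell"],
    "immune cell": ["T cell", "B cell"],
    "T cell": ["CD4+ T cell", "CD8+ T cell"],
}

def undirected_adj(tree):
    """Build an undirected adjacency map from parent -> children edges."""
    adj = {}
    for parent, children in tree.items():
        for child in children:
            adj.setdefault(parent, set()).add(child)
            adj.setdefault(child, set()).add(parent)
    return adj

def path_distance(adj, a, b):
    """Shortest-path length between two ontology terms via BFS; None if disconnected."""
    if a == b:
        return 0
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for nb in adj.get(node, ()):
            if nb == b:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None

def path_similarity(adj, predicted, truth):
    """Map graph distance to (0, 1]: identical terms score 1, distant terms less."""
    dist = path_distance(adj, predicted, truth)
    return None if dist is None else 1.0 / (1.0 + dist)

adj = undirected_adj(ONTOLOGY)
# Sibling subtypes share a parent, so a near-miss still earns partial credit:
print(path_similarity(adj, "CD4+ T cell", "CD8+ T cell"))  # 1/(1+2) ≈ 0.333
```

This kind of metric rewards predictions that land in the right branch of the ontology, which a flat accuracy score cannot distinguish from an arbitrary error.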
Problem

Research questions and friction points this paper is trying to address.

Single-cell foundation models output only discrete labels, collapsing cellular complexity into predefined categories
Biologists need rich, contextual explanations of cell type, tissue origin, disease associations, and pathway activity
No prior framework directly translates gene expression profiles into structured natural language descriptions
Innovation

Methods, ideas, or system contributions that make the work stand out.

First generative framework mapping scRNA-seq profiles to structured natural language descriptions
Integrates gene-level embeddings from a single-cell foundation model with a pretrained LLM for end-to-end sequence-to-text generation
Produces interpretable, human-readable summaries that generalize across datasets and cell types