Gene Incremental Learning for Single-Cell Transcriptomics

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses catastrophic forgetting in token-level incremental learning, specifically the forgetting of gene representations during continual learning on single-cell transcriptomic data, where genes serve as learnable “tokens.” We propose the **Gene-Incremental Learning (GIL) paradigm**, the first framework tailored to the unique characteristics of single-cell data. Methodologically, GIL integrates gene expression feature modeling, dynamic architecture expansion, and dedicated forgetting-mitigation strategies. We further introduce the first standardized GIL benchmark and evaluation protocol. Extensive experiments on multiple large-scale single-cell datasets demonstrate that GIL significantly alleviates gene-level forgetting while ensuring robustness and reproducibility. This work bridges a critical gap in token-level incremental learning within biomedicine and establishes foundational infrastructure and a novel paradigm for dynamic multimodal modeling of single-cell omics data.

📝 Abstract
Classes, as fundamental elements of computer vision, have been extensively studied within incremental learning frameworks. In contrast, tokens, which play essential roles in many research fields, exhibit similar growth over time, yet their incremental learning remains largely unexplored. This research gap stems primarily from the holistic nature of tokens in language, which makes it difficult to design incremental learning frameworks for them. To overcome this obstacle, we turn to one type of token, the gene, in a large-scale biological dataset (single-cell transcriptomics) to formulate a pipeline for gene incremental learning and establish corresponding evaluations. We find that the forgetting problem also exists in gene incremental learning, so we adapt existing class incremental learning methods to mitigate the forgetting of genes. Through extensive experiments, we demonstrate the soundness of our framework design and evaluations, as well as the effectiveness of our method adaptations. Finally, we provide a complete benchmark for gene incremental learning in single-cell transcriptomics.
Problem

Research questions and friction points this paper is trying to address.

Formulating a gene incremental learning pipeline for single-cell transcriptomics data
Addressing the forgetting problem in gene incremental learning scenarios
Establishing a comprehensive benchmark for incremental learning of biological tokens
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapted existing class incremental learning methods to genes
Established a gene incremental learning pipeline with corresponding evaluations
Mitigated the forgetting of genes during incremental learning
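
To make "forgetting" concrete, here is a minimal sketch of the average-forgetting measure commonly used in class incremental learning, applied to groups of genes learned in stages. This is an illustration only: the summary does not state which metric the paper's evaluation uses, and the function name `average_forgetting` and the score layout `acc[t][k]` are assumptions introduced here.

```python
from typing import List


def average_forgetting(acc: List[List[float]]) -> float:
    """Average forgetting over earlier gene groups (hypothetical helper).

    acc[t][k] is an assumed layout: the score on gene group k measured
    right after training stage t, defined for k <= t (so acc[t] has t + 1 entries).
    """
    num_stages = len(acc)
    if num_stages < 2:
        return 0.0  # nothing has been learned earlier, so nothing can be forgotten

    final_scores = acc[num_stages - 1]
    drops = []
    for k in range(num_stages - 1):
        # Best score group k reached at any stage before the final one.
        best_earlier = max(acc[t][k] for t in range(k, num_stages - 1))
        drops.append(best_earlier - final_scores[k])
    return sum(drops) / len(drops)


# Made-up scores for three stages, purely for illustration.
scores = [
    [0.90],
    [0.70, 0.80],
    [0.60, 0.75, 0.85],
]
print(round(average_forgetting(scores), 3))  # 0.175
```

A larger value means earlier gene groups lost more performance as new genes were added; a forgetting-mitigation method aims to push this value toward zero.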
Jiaxin Qi
Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
Yan Cui
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
Jianqiang Huang
Nanyang Technological University, Chinese Academy of Sciences
Computer Vision, Machine Learning, Causality
Gaogang Xie
Computer Network Information Center, Chinese Academy of Sciences, Beijing, China