Gene Incremental Learning for Single-Cell Transcriptomics

📅 2025-11-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses catastrophic forgetting in token-level incremental learning, specifically the forgetting of gene representations during continual learning on single-cell transcriptomic data, where genes serve as the learnable “tokens.” We propose the **Gene-Incremental Learning (GIL) paradigm**, a pipeline tailored to single-cell data, and adapt existing class incremental learning methods to mitigate the forgetting of previously learned genes. We further introduce a GIL benchmark and evaluation protocol. Extensive experiments on single-cell transcriptomics data support the soundness of the framework design and evaluations and show that the adapted methods reduce gene-level forgetting.

📝 Abstract

Classes, as fundamental elements of computer vision, have been extensively studied within incremental learning frameworks. Tokens play essential roles in many research fields and grow over time in a similar way, yet their incremental learning has received little attention. This gap stems primarily from the holistic nature of tokens in language, which makes it difficult to design incremental learning frameworks for them. To overcome this obstacle, we turn to one type of token, the gene, in a large-scale biological setting (single-cell transcriptomics) to formulate a pipeline for gene incremental learning and establish corresponding evaluations. We find that the forgetting problem also exists in gene incremental learning, so we adapt existing class incremental learning methods to mitigate the forgetting of genes. Through extensive experiments, we demonstrate the soundness of our framework design and evaluations, as well as the effectiveness of our method adaptations. Finally, we provide a complete benchmark for gene incremental learning in single-cell transcriptomics.
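
The abstract does not spell out the pipeline or the evaluation, so the following is only a minimal sketch of what such a setup could look like, not the authors' protocol. It assumes (hypothetically) that the gene vocabulary is split into disjoint groups introduced one stage at a time, and that forgetting is measured with the average-forgetting metric commonly used in class incremental learning. The function names, the gene symbols, and the score values are all illustrative.

```python
from typing import List, Sequence


def split_genes_into_stages(genes: Sequence[str], n_stages: int) -> List[List[str]]:
    """Partition a gene vocabulary into n_stages disjoint, contiguous groups.

    Assumption: one group of genes is introduced per incremental stage.
    """
    if n_stages <= 0 or n_stages > len(genes):
        raise ValueError("n_stages must be between 1 and the number of genes")
    base, extra = divmod(len(genes), n_stages)
    stages, start = [], 0
    for i in range(n_stages):
        size = base + (1 if i < extra else 0)  # spread the remainder over early stages
        stages.append(list(genes[start:start + size]))
        start += size
    return stages


def average_forgetting(scores: List[List[float]]) -> float:
    """Average forgetting after the final stage.

    scores[t][j] is the score (higher is better) on gene group j measured
    after training stage t, defined for j <= t. For each earlier group j the
    forgetting is its best score before the final stage minus its final score.
    """
    final = len(scores) - 1
    if final < 1:
        return 0.0  # nothing can be forgotten with a single stage
    drops = []
    for j in range(final):
        best_earlier = max(scores[t][j] for t in range(j, final))
        drops.append(best_earlier - scores[final][j])
    return sum(drops) / len(drops)


# Toy inputs, made up purely for illustration.
stages = split_genes_into_stages(["GAPDH", "ACTB", "CD3E", "CD19", "MS4A1"], 2)
toy_scores = [
    [0.9],              # after stage 0
    [0.6, 0.8],         # after stage 1
    [0.5, 0.7, 0.85],   # after stage 2
]
print(stages)
print(round(average_forgetting(toy_scores), 2))
```

A positive value means earlier gene groups score worse after later stages than they did at their best, which is the kind of degradation the adapted methods are meant to reduce.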

Problem

Research questions and friction points this paper is trying to address.

Developing gene incremental learning for single-cell transcriptomics data

Addressing the forgetting problem in gene incremental learning scenarios

Establishing a comprehensive benchmark for incremental learning of biological tokens

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapted class incremental learning methods to genes

Established a gene incremental learning pipeline

Mitigated the forgetting problem in gene incremental learning
