PlantBiMoE: A Bidirectional Foundation Model with SparseMoE for Plant Genomes

📅 2025-12-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Plant genomic modeling faces challenges of parameter redundancy and difficulty capturing bidirectional dependencies across DNA strands. This paper introduces PlantBiMoE, a lightweight and efficient foundational model that pioneers the integration of a bidirectional Mamba architecture with a sparse mixture-of-experts (SparseMoE) mechanism, enabling joint modeling of forward and reverse DNA strands while substantially reducing activated parameters. Trained and evaluated on the enhanced plant genomics benchmark MPGB, PlantBiMoE achieves state-of-the-art performance on 20 out of 31 downstream tasks, outperforming existing methods including AgroNT and PDLLMs in average performance. Key contributions are: (1) the first bidirectional state-space model specifically designed for plant genomes; (2) a structured sparsity mechanism that effectively balances representational capacity and computational efficiency; and (3) empirical validation of consistent performance gains from bidirectional modeling across diverse tasks—including functional element identification and variant effect prediction—demonstrating its broad applicability.

📝 Abstract
Understanding the underlying linguistic rules of plant genomes remains a fundamental challenge in computational biology. Recent models such as AgroNT and PDLLMs have made notable progress, but they suffer from excessive parameter size and a limited ability to model the bidirectional nature of DNA strands, respectively. To address these limitations, we propose PlantBiMoE, a lightweight and expressive plant genome language model that integrates bidirectional Mamba with a Sparse Mixture-of-Experts (SparseMoE) framework. The bidirectional Mamba enables the model to capture structural dependencies across both the forward and reverse DNA strands, while SparseMoE significantly reduces the number of active parameters, improving computational efficiency without sacrificing modeling capacity. We evaluated our model on the Modified Plants Genome Benchmark (MPGB), an enhanced genomic benchmark that consolidates 31 datasets across 11 representative tasks, with input sequence lengths ranging from 50 to 6,000 bp. Experimental results show that PlantBiMoE achieves the best performance on 20 of the 31 datasets and the best average performance among the compared models. In summary, these results demonstrate that our model can effectively represent plant genomic sequences and serve as a robust computational tool for diverse genomic tasks, contributing to plant genomics, gene editing, and synthetic biology. The code is available at: https://github.com/HUST-Keep-Lin/PlantBiMoE
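To make the idea of modeling both DNA strands concrete, here is a minimal Python sketch. It is not the PlantBiMoE implementation: `causal_scan` is a toy recurrence standing in for a state-space layer, and summing the two directions is only one assumed way to combine them. All function names are hypothetical.

```python
# Illustrative only: a toy stand-in for bidirectional sequence processing.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse strand of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def causal_scan(values, decay=0.5):
    """Toy left-to-right recurrence h_t = decay * h_{t-1} + x_t.

    Assumption: this replaces a real state-space (Mamba) layer, which is
    far more elaborate; only the one-directional data flow is the same.
    """
    state, out = 0.0, []
    for x in values:
        state = decay * state + x
        out.append(state)
    return out

def bidirectional_scan(values, decay=0.5):
    """Scan left-to-right and right-to-left, then sum per position."""
    forward = causal_scan(values, decay)
    backward = causal_scan(values[::-1], decay)[::-1]
    return [f + b for f, b in zip(forward, backward)]

print(reverse_complement("ATGC"))
print(bidirectional_scan([1.0, 0.0, 0.0]))
```

With a one-directional scan, position 0 never sees what follows it; the backward pass is what lets every position draw on context from both sides.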
Problem

Research questions and friction points this paper is trying to address.

Modeling bidirectional dependencies in plant DNA strands effectively
Reducing excessive parameters in plant genome language models
Improving computational efficiency for diverse genomic sequence tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bidirectional Mamba captures dependencies across forward and reverse DNA strands
SparseMoE reduces the number of active parameters, improving computational efficiency
Lightweight model excels on diverse genomic tasks
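The way a sparse mixture-of-experts layer keeps active parameters low can be sketched in a few lines of Python. This assumes generic top-k routing with a softmax over the selected experts; the summary does not state the routing rule, expert count, or k used by PlantBiMoE, so every name and number below is hypothetical.

```python
import math

def top_k_route(logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def sparse_moe(x, experts, router_logits, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

def active_fraction(num_experts, k):
    """Share of expert parameters used per token, assuming equal-size experts."""
    return k / num_experts

# Hypothetical experts: simple scalar functions standing in for feed-forward blocks.
experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: 4 * x]
print(sparse_moe(3.0, experts, router_logits=[0.0, 2.0, 1.0, 2.0], k=2))
print(active_fraction(num_experts=4, k=2))
```

Only the k selected experts run for a given input, so compute per token scales with k rather than with the total number of experts, while the full set of experts still contributes capacity across different inputs.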
Kepeng Lin
School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, 430074, China
Qizhe Zhang
School of Computer Science, Peking University
Vision Language Model, Computer Vision, Machine Learning
Rui Wang
Hubei Hongshan Laboratory, Wuhan 430070, China; College of Informatics, Agricultural Bioinformatics Key Laboratory of Hubei Province, Huazhong Agricultural University, Wuhan, 430070, China
Xuehai Hu
Hubei Hongshan Laboratory, Wuhan 430070, China; College of Informatics, Agricultural Bioinformatics Key Laboratory of Hubei Province, Huazhong Agricultural University, Wuhan, 430070, China
Wei Xu
School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, 430074, China