🤖 AI Summary
Existing MoE-LoRA approaches suffer from poor parameter efficiency and coarse adaptation granularity due to expert proliferation and instance-level routing. To address these limitations, this work proposes CoMoL, a framework that stores each expert as a compact core matrix and performs token-level dynamic routing within a shared core space, coupled with a soft-merging strategy. This design enables fine-grained, input-adaptive fine-tuning while preserving expert diversity. Furthermore, the routing network is projected into a low-rank subspace, substantially improving parameter efficiency without compromising representational capacity. Experimental results show that CoMoL consistently outperforms existing methods across multiple tasks, achieving parameter efficiency close to that of standard LoRA while retaining the flexibility of MoE-LoRA architectures.
📝 Abstract
Large language models (LLMs) achieve remarkable performance on diverse downstream and domain-specific tasks via parameter-efficient fine-tuning (PEFT). However, existing PEFT methods, particularly MoE-LoRA architectures, suffer from limited parameter efficiency and coarse-grained adaptation due to the proliferation of LoRA experts and instance-level routing. To address these issues, we propose Core Space Mixture of LoRA (**CoMoL**), a novel MoE-LoRA framework that combines expert diversity, parameter efficiency, and fine-grained adaptation. Specifically, CoMoL introduces two key components: core space experts and core space routing. Core space experts store each expert as a compact core matrix, preserving diversity while controlling parameter growth. Core space routing dynamically selects and activates the appropriate core experts for each token, enabling fine-grained, per-token adaptation. The activated core experts are then merged via a soft-merging strategy into a single core expert, which is combined with a shared LoRA to form a specialized LoRA module. In addition, the routing network is projected into the same low-rank space as the LoRA matrices, further reducing parameter overhead without compromising expressiveness. Extensive experiments demonstrate that CoMoL retains the adaptability of MoE-LoRA architectures while achieving parameter efficiency comparable to standard LoRA, consistently outperforming existing methods across multiple tasks.
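As a rough illustration of how these pieces could fit together, the sketch below implements a CoMoL-style adapter layer in PyTorch. The class name, the rank × rank parameterization of the core experts, the additive shared core, and the two-stage low-rank router are assumptions made for illustration under the description above, not the authors' reference implementation.

```python
# Minimal, illustrative sketch of a core-space MoE-LoRA adapter.
# Shapes, the core parameterization, and the soft-merging rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoreSpaceMoELoRA(nn.Module):
    """Hypothetical CoMoL-style adapter: shared low-rank projections,
    per-expert compact core matrices, and a token-level low-rank router."""

    def __init__(self, d_model: int, rank: int = 8, num_experts: int = 4):
        super().__init__()
        # Shared low-rank projections, as in standard LoRA (A down, B up).
        self.lora_A = nn.Parameter(torch.randn(d_model, rank) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(rank, d_model))
        # Each expert is only a small rank x rank "core" matrix, so adding
        # experts grows the parameter count far more slowly than full LoRA pairs.
        self.cores = nn.Parameter(torch.randn(num_experts, rank, rank) * 0.01)
        # Shared core representing the common (shared-LoRA) path.
        self.shared_core = nn.Parameter(torch.eye(rank))
        # Router projected into the same low-rank space: d_model -> rank -> E.
        self.router_down = nn.Linear(d_model, rank, bias=False)
        self.router_up = nn.Linear(rank, num_experts, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); routing is per token, not per instance.
        gates = F.softmax(self.router_up(self.router_down(x)), dim=-1)   # (B, S, E)
        # Soft-merge the activated core experts into one core per token.
        merged_core = torch.einsum("bse,eij->bsij", gates, self.cores)   # (B, S, r, r)
        # Combine with the shared core so every token also uses the common path.
        merged_core = merged_core + self.shared_core
        # Down-project, apply the token-specific core, then up-project.
        h = torch.einsum("bsd,dr->bsr", x, self.lora_A)                  # (B, S, r)
        h = torch.einsum("bsr,bsri->bsi", h, merged_core)                # (B, S, r)
        # Low-rank update to be added to the frozen base layer's output.
        return torch.einsum("bsr,rd->bsd", h, self.lora_B)               # (B, S, d_model)
```

In this sketch, each additional expert costs only rank² parameters rather than a full LoRA pair, and the low-rank router adds roughly (d_model + num_experts) × rank parameters instead of d_model × num_experts, which is the intuition behind the claimed parameter efficiency close to standard LoRA.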