Parameter-Efficient Fine-Tuning of LLMs with Mixture of Space Experts

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a key limitation of existing parameter-efficient fine-tuning methods: they operate solely in Euclidean space and struggle to capture the complex geometric structures inherent in linguistic data. To overcome this, we propose the Mixture of Space (MoS) framework, the first approach to integrate multiple non-Euclidean geometries, such as hyperbolic and spherical spaces, into parameter-efficient fine-tuning. MoS employs learnable curvatures and a lightweight routing mechanism to dynamically fuse representations from heterogeneous geometric experts. Building on this framework, we extend LoRA to MoSLoRA, enabling input-adaptive selection and composition of geometric spaces. By transcending the constraints of a single manifold, our method achieves significant performance gains, improving accuracy by up to 5.6% on MATH500 and 15.9% on MAWPS over strong baselines, demonstrating superior representational capacity and adaptability.
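To make the summary concrete, here is a minimal sketch (our own illustration, not the authors' released code) of the curvature-aware primitive such a method relies on: exponential maps at the origin that lift a Euclidean activation into a hyperbolic or spherical space with a learnable positive curvature c. The function names and stability clamps below are our assumptions.

```python
import torch

def exp_map_hyperbolic(v: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    """Origin exponential map of a Poincare ball with curvature -c (c > 0):
    exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||)."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-6)
    arg = c.clamp_min(1e-6).sqrt() * norm
    return torch.tanh(arg) * v / arg

def exp_map_spherical(v: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    """Origin exponential map in the stereographic spherical model with
    curvature +c (c > 0); tan replaces tanh. The argument is clamped
    below pi/2 to avoid the tangent's pole."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-6)
    arg = c.clamp_min(1e-6).sqrt() * norm
    return torch.tan(arg.clamp(max=1.5)) * v / arg

# Example: lift a batch of activations with a learnable curvature.
v = torch.randn(4, 16)
c = torch.nn.functional.softplus(torch.zeros(()))  # keep curvature positive
h_hyp, h_sph = exp_map_hyperbolic(v, c), exp_map_spherical(v, c)
```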

📝 Abstract
Large Language Models (LLMs) have achieved remarkable progress, with Parameter-Efficient Fine-Tuning (PEFT) emerging as a key technique for downstream task adaptation. However, existing PEFT methods mainly operate in Euclidean space, fundamentally limiting their capacity to capture complex geometric structures inherent in language data. While alternative geometric spaces, such as hyperbolic geometries for hierarchical data and spherical manifolds for circular patterns, offer theoretical advantages, forcing representations into a single manifold type ultimately limits expressiveness, even when curvature parameters are learnable. To address this, we propose Mixture of Space (MoS), a unified framework that leverages multiple geometric spaces simultaneously to learn richer, curvature-aware representations. Building on this scheme, we develop MoSLoRA, which extends Low-Rank Adaptation (LoRA) with heterogeneous geometric experts, enabling models to dynamically select or combine appropriate geometric spaces based on input context. To address the computational overhead of frequent manifold switching, we introduce a lightweight routing mechanism, and we provide empirical insights into how curvature optimization affects training stability and model performance. Our experiments across diverse benchmarks demonstrate that MoSLoRA consistently outperforms strong baselines, achieving up to 5.6% improvement on MATH500 and 15.9% on MAWPS.
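The abstract names three moving parts: a shared LoRA bottleneck, heterogeneous geometric experts with learnable curvatures, and a lightweight router that mixes them per input. The sketch below is a speculative reconstruction from that description only; the class name, expert count, router design, and initialization are our choices, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def exp_map(v: torch.Tensor, c: torch.Tensor, spherical: bool) -> torch.Tensor:
    """Origin exponential map: tanh gives a Poincare ball (curvature -c),
    tan gives a stereographic sphere (curvature +c)."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-6)
    arg = c.sqrt() * norm
    scale = torch.tan(arg.clamp(max=1.5)) if spherical else torch.tanh(arg)
    return scale * v / arg


class MoSLoRASketch(nn.Module):
    """Hypothetical MoSLoRA-style adapter: a shared low-rank pair (A, B)
    with Euclidean, hyperbolic, and spherical experts fused by a softmax
    router. Mirrors the abstract's description, not released code."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.A = nn.Linear(d_in, rank, bias=False)
        self.B = nn.Linear(rank, d_out, bias=False)
        nn.init.zeros_(self.B.weight)               # LoRA convention: delta starts at 0
        self.router = nn.Linear(d_in, 3)            # Euclidean / hyperbolic / spherical
        self.log_c = nn.Parameter(torch.zeros(2))   # learnable curvatures (hyp, sph)
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.A(x)                               # (..., rank) shared bottleneck
        c_hyp, c_sph = self.log_c.exp()             # exp keeps curvatures positive
        experts = torch.stack(
            [h,                                     # Euclidean expert (identity)
             exp_map(h, c_hyp, spherical=False),    # hyperbolic expert
             exp_map(h, c_sph, spherical=True)],    # spherical expert
            dim=-1)                                 # (..., rank, 3)
        gates = F.softmax(self.router(x), dim=-1)   # input-adaptive mixture weights
        # Note: a fully faithful version would log-map experts back to the
        # tangent space before fusing; we fuse directly for brevity.
        mixed = (experts * gates.unsqueeze(-2)).sum(dim=-1)
        return self.scaling * self.B(mixed)         # delta added to the frozen W @ x


# Usage: the adapter output is added to the frozen layer's output.
layer = MoSLoRASketch(d_in=768, d_out=768)
delta = layer(torch.randn(2, 10, 768))              # (batch, seq, d_out)
```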
Problem

Research questions and friction points this paper is trying to address.

Parameter-Efficient Fine-Tuning
Geometric Representation
Non-Euclidean Space
Large Language Models
Manifold Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Space
Parameter-Efficient Fine-Tuning
Geometric Representation Learning
LoRA
Curvature-Aware Adaptation
Buze Zhang
School of Software Engineering, Xi’an Jiaotong University, China
Jinkai Tao
School of Information, Central University of Finance and Economics, Beijing, China
Zilang Zeng
Department of Artificial Intelligence, Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
Neil He
Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Ali Maatouk
Yale University
Network Optimization · Machine Learning · Resource Allocation · 5G · Age of Information
Menglin Yang
HKUST(GZ) | Yale University | CUHK
Hyperbolic Representation Learning · Transformer · Recommender System · LLM
Rex Ying
Department of Computer Science, Yale University, New Haven, CT, USA