Mix-of-Language-Experts Architecture for Multilingual Programming

📅 2025-06-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle to balance efficiency and language-specific specialization in multilingual programming tasks. Method: This paper proposes MoLE (Mix-of-Language-Experts), a novel architecture that combines a base model with a shared LoRA (low-rank adaptation) module and a collection of language-specific LoRA experts. During finetuning, joint optimization enables cross-lingual knowledge sharing; during inference, input-driven routing selects the expert matching the programming language of the token being generated, achieving both parameter efficiency and linguistic specialization. Contribution/Results: On code comprehension, generation, and translation benchmarks, MoLE outperforms a single shared model finetuned across all languages in accuracy, while using far fewer parameters than per-language LoRA finetuning. These results indicate that MoLE improves language adaptability without increasing computational overhead, supporting its use for resource-efficient multilingual code modeling.

📝 Abstract
Large language models (LLMs) have demonstrated impressive capabilities in aiding developers with tasks like code comprehension, generation, and translation. Supporting multilingual programming -- i.e., coding tasks across multiple programming languages -- typically requires either (1) finetuning a single LLM across all programming languages, which is cost-efficient but sacrifices language-specific specialization and performance, or (2) finetuning separate LLMs for each programming language, which allows for specialization but is computationally expensive and storage-intensive due to the duplication of parameters. This paper introduces MoLE (Mix-of-Language-Experts), a novel architecture that balances efficiency and specialization for multilingual programming. MoLE is composed of a base model, a shared LoRA (low-rank adaptation) module, and a collection of language-specific LoRA modules. These modules are jointly optimized during the finetuning process, enabling effective knowledge sharing and specialization across programming languages. During inference, MoLE automatically routes to the language-specific LoRA module corresponding to the programming language of the code token being generated. Our experiments demonstrate that MoLE achieves greater parameter efficiency compared to training separate language-specific LoRAs, while outperforming a single shared LLM finetuned for all programming languages in terms of accuracy.
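The architecture described in the abstract can be illustrated with a minimal sketch of one adapted layer. Everything here is an assumption for illustration, not the paper's implementation: the hidden size, LoRA rank, scaling factor, and zero-initialized `B` matrices follow common LoRA convention, and `mole_forward` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 8                       # hidden size and LoRA rank (illustrative)
langs = ["python", "java", "cpp"]  # example language experts

# Frozen base weight, one shared LoRA, and one LoRA expert per language.
# B matrices start at zero, the usual LoRA init, so the adapters contribute
# nothing before training.
W = rng.standard_normal((d, d)) * 0.02
shared = {"A": rng.standard_normal((r, d)) * 0.01, "B": np.zeros((d, r))}
experts = {
    lang: {"A": rng.standard_normal((r, d)) * 0.01, "B": np.zeros((d, r))}
    for lang in langs
}

def mole_forward(x, lang, alpha=16):
    """Base output plus the shared and the routed language-specific update."""
    scale = alpha / r
    h = W @ x
    h = h + scale * (shared["B"] @ (shared["A"] @ x))  # shared LoRA delta
    e = experts[lang]                                  # routing: pick expert
    h = h + scale * (e["B"] @ (e["A"] @ x))            # language LoRA delta
    return h

x = rng.standard_normal(d)
y = mole_forward(x, "python")
```

At inference, the `lang` argument stands in for the paper's input-driven routing: the expert is chosen per code token by the programming language being generated, while the shared module is always applied.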
Problem

Research questions and friction points this paper is trying to address.

Balancing efficiency and specialization in multilingual programming models
Reducing computational costs while maintaining language-specific performance
Optimizing parameter usage in multilingual code generation tasks
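Back-of-the-envelope arithmetic shows where the parameter savings can come from. All numbers below are assumptions for illustration (the paper does not specify these ranks or sizes); the sketch assumes one plausible configuration in which the language experts use a lower rank than the shared module.

```python
# Adapting n_layers weight matrices of size d x d with LoRA.
d, n_layers, n_langs = 4096, 32, 6

def lora_params(r):
    # One A (r x d) and one B (d x r) matrix per adapted layer.
    return n_layers * 2 * d * r

# (1) A separate rank-16 LoRA per language, parameters fully duplicated.
separate = n_langs * lora_params(16)

# (2) MoLE-style: one shared rank-16 LoRA plus small rank-4 experts.
mole = lora_params(16) + n_langs * lora_params(4)

print(separate, mole, round(mole / separate, 2))  # → 25165824 10485760 0.42
```

Under these assumed ranks, the shared-plus-experts layout needs less than half the adapter parameters of per-language finetuning, and the gap widens as more languages are added, since each new language contributes only a small expert rather than a full copy.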
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mix-of-Language-Experts architecture balances efficiency and specialization
Combines shared and language-specific LoRA modules for optimization
Automatically routes to language-specific modules during inference