Chart2Code-MoLA: Efficient Multi-Modal Code Generation via Adaptive Expert Routing

📅 2025-11-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address weak cross-type generalization, high memory overhead, and non-modular architectures in multimodal large models for chart-to-code generation, this paper proposes a lightweight and efficient framework integrating Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). The authors design a complexity-aware expert routing mechanism that jointly incorporates domain-specialized experts and load-balanced sparse gating for input-adaptive expert assignment. LoRA is employed for parameter-efficient fine-tuning to improve generalization and training stability, and a customized training strategy further optimizes routing stability and semantic alignment accuracy. Evaluated on the Chart2Code-160k benchmark, the method achieves a 17% improvement in code generation accuracy, reduces peak GPU memory consumption by 18%, accelerates convergence by 20%, and notably enhances code quality, especially for complex charts.

📝 Abstract
Chart-to-code generation is a critical task in automated data visualization, translating complex chart structures into executable programs. While recent Multi-modal Large Language Models (MLLMs) improve chart representation, existing approaches still struggle to achieve cross-type generalization, memory efficiency, and modular design. To address these challenges, this paper proposes C2C-MoLA, a multimodal framework that synergizes Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). The MoE component uses a complexity-aware routing mechanism with domain-specialized experts and load-balanced sparse gating, dynamically allocating inputs based on learnable structural metrics such as element count and chart complexity. LoRA enables parameter-efficient updates for resource-conscious tuning, further supported by a tailored training strategy that aligns routing stability with semantic accuracy. Experiments on Chart2Code-160k show that the proposed model improves generation accuracy by up to 17%, reduces peak GPU memory by 18%, and accelerates convergence by 20% compared to standard fine-tuning and LoRA-only baselines, with the largest gains on complex charts. Ablation studies validate design choices such as 8 experts and rank-8 LoRA, and confirm scalability for real-world multimodal code generation.
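The parameter-efficient update described above can be illustrated with a minimal LoRA adapter. This is a hedged sketch, not the authors' implementation: the rank-8 / alpha-16 defaults echo the ablation result mentioned in the abstract, but the class name, initialization, and scaling convention are assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x.

    Illustrative sketch of a rank-8 LoRA adapter (the configuration the paper's
    ablations favor); names and defaults are hypothetical.
    """
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)   # pretrained weight stays frozen
        self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: no change at step 0
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T

layer = LoRALinear(64, 64)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
```

Because `lora_B` starts at zero, the adapted layer initially reproduces the frozen base layer exactly, while only the two rank-8 factors (here 1,024 of 5,184 parameters) receive gradients.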
Problem

Research questions and friction points this paper is trying to address.

Achieving cross-type generalization in chart-to-code generation
Improving memory efficiency for multimodal code generation models
Enhancing modular design through adaptive expert routing mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Experts with complexity-aware routing
Low-Rank Adaptation for parameter-efficient tuning
Load-balanced sparse gating for dynamic expert allocation
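The routing ideas above can be sketched as a top-k sparse gate conditioned on token features plus scalar complexity cues (e.g. element count), with a Switch-Transformer-style load-balancing auxiliary loss. This is an assumption-laden illustration: the paper's exact complexity metrics, gate form, and loss weighting are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComplexityAwareRouter(nn.Module):
    """Sparse top-k gate over chart experts, conditioned on hidden states and
    complexity features. Hypothetical sketch of complexity-aware routing with
    load-balanced gating; not the authors' implementation.
    """
    def __init__(self, d_model, num_experts=8, top_k=2, num_complexity_feats=2):
        super().__init__()
        self.gate = nn.Linear(d_model + num_complexity_feats, num_experts)
        self.top_k = top_k
        self.num_experts = num_experts

    def forward(self, h, complexity):
        # h: (tokens, d_model); complexity: (tokens, num_complexity_feats),
        # e.g. normalized element count and a chart-complexity score.
        logits = self.gate(torch.cat([h, complexity], dim=-1))
        probs = F.softmax(logits, dim=-1)
        topk_w, topk_idx = probs.topk(self.top_k, dim=-1)
        topk_w = topk_w / topk_w.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
        # Load-balancing loss: fraction of tokens dispatched to each expert
        # (hard, no gradient) times mean gate probability (soft), scaled by
        # num_experts so a uniform router scores 1.0.
        with torch.no_grad():
            dispatch = F.one_hot(topk_idx[..., 0], self.num_experts).float().mean(0)
        importance = probs.mean(0)
        aux_loss = self.num_experts * (dispatch * importance).sum()
        return topk_idx, topk_w, aux_loss
```

At train time the auxiliary loss would be added to the task loss with a small coefficient, pushing the gate away from collapsing onto a few experts while the complexity features let simple and complex charts route to different specialists.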
Yifei Wang
Department of Computer Science, City University of Hong Kong, Hong Kong, China
Jacky Keung
Department of Computer Science, City University of Hong Kong, Hong Kong, China
Zhenyu Mao
Department of Computer Science, City University of Hong Kong, Hong Kong, China
Jingyu Zhang
WNLO, Huazhong University of Science and Technology
Optical
Yuchen Cao
Carnegie Mellon University
Spatial Computing, Computer Vision, Artificial Intelligence, Extended Reality