🤖 AI Summary
To address weak cross-type generalization, high memory overhead, and non-modular architectures in multimodal large models for chart-to-code generation, this paper proposes a lightweight and efficient framework that integrates Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). We design a complexity-aware expert routing mechanism that combines domain-specialized experts with load-balanced sparse gating for input-adaptive expert assignment. LoRA provides parameter-efficient fine-tuning that improves generalization and training stability, and a customized training strategy further optimizes routing stability and semantic alignment accuracy. On the Chart2Code-160k benchmark, our method improves code generation accuracy by 17%, reduces peak GPU memory consumption by 18%, accelerates convergence by 20%, and notably enhances code quality, especially for complex charts.
📝 Abstract
Chart-to-code generation is a critical task in automated data visualization, translating complex chart structures into executable programs. While recent Multimodal Large Language Models (MLLMs) improve chart representation, existing approaches still struggle to achieve cross-type generalization, memory efficiency, and modular design. To address these challenges, this paper proposes C2C-MoLA, a multimodal framework that synergizes Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA). The MoE component uses a complexity-aware routing mechanism with domain-specialized experts and load-balanced sparse gating, dynamically allocating inputs based on learnable structural metrics such as element count and chart complexity. LoRA enables parameter-efficient updates for resource-conscious tuning, further supported by a tailored training strategy that aligns routing stability with semantic accuracy. On Chart2Code-160k, the proposed model improves generation accuracy by up to 17%, reduces peak GPU memory by 18%, and accelerates convergence by 20% compared with standard fine-tuning and LoRA-only baselines, with the largest gains on complex charts. Ablation studies validate design choices, such as 8 experts and rank-8 LoRA, and confirm scalability for real-world multimodal code generation.
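The core mechanism described above, complexity-aware sparse routing over LoRA-adapted experts with a load-balancing objective, can be sketched in PyTorch as follows. This is a minimal illustrative sketch, not the authors' implementation: the gate's input feature (a scalar complexity score concatenated to the hidden state), the Switch-style auxiliary loss, and all class and parameter names are assumptions chosen to match the abstract's description (8 experts, rank-8 LoRA, top-k gating).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank (LoRA) update."""
    def __init__(self, d_in, d_out, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)  # only the low-rank factors train
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # up-projection, zero-init
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


class ComplexityAwareMoE(nn.Module):
    """Top-k sparse gating over LoRA experts; the gate also sees a
    per-sample complexity feature (e.g. a normalized chart element count)."""
    def __init__(self, d_model, n_experts=8, top_k=2, rank=8):
        super().__init__()
        self.experts = nn.ModuleList(
            LoRALinear(d_model, d_model, rank) for _ in range(n_experts))
        self.gate = nn.Linear(d_model + 1, n_experts)  # +1 for the complexity scalar
        self.top_k = top_k
        self.n_experts = n_experts

    def forward(self, x, complexity):
        # x: (batch, d_model); complexity: (batch,)
        logits = self.gate(torch.cat([x, complexity.unsqueeze(-1)], dim=-1))
        probs = F.softmax(logits, dim=-1)
        top_p, top_i = probs.topk(self.top_k, dim=-1)
        top_p = top_p / top_p.sum(-1, keepdim=True)  # renormalize selected gates

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = top_i[:, slot]
            for e in range(self.n_experts):
                mask = idx == e
                if mask.any():
                    out[mask] += top_p[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])

        # Switch-style load-balancing loss: mean gate prob * fraction routed, per expert
        frac = F.one_hot(top_i, self.n_experts).float().mean(dim=(0, 1))
        aux_loss = self.n_experts * (probs.mean(0) * frac).sum()
        return out, aux_loss
```

In training, `aux_loss` would be added (with a small weight) to the generation loss so that no single expert dominates; the complexity feature lets the gate specialize experts by chart difficulty rather than by surface features alone.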