Enabling Flexible Multi-LLM Integration for Scalable Knowledge Aggregation

📅 2025-05-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) face significant challenges in multi-model ensembling, including excessive memory overhead, inflexible weight merging, severe knowledge interference, and degraded task performance. Method: This paper proposes an adaptive multi-LLM ensemble framework featuring a novel feedback-regulated selection network and a dynamic weighted fusion mechanism, enabling task-aware dynamic selection of heterogeneous models and knowledge aggregation; it further introduces a feedback-driven loss function and a collaborative distillation strategy across multiple LLMs to mitigate knowledge conflicts. Contribution/Results: Experiments demonstrate that the method reduces knowledge interference by up to 50%, substantially improving ensemble stability, scalability, and multi-task robustness. Crucially, it achieves these gains without full-parameter fine-tuning, effectively circumventing the memory and adaptability bottlenecks inherent in conventional ensembling approaches.

📝 Abstract
Large language models (LLMs) have shown remarkable promise but remain challenging to continually improve through traditional finetuning, particularly when integrating capabilities from other specialized LLMs. Popular methods like ensembling and weight merging require substantial memory and struggle to adapt to changing data environments. Recent efforts have transferred knowledge from multiple LLMs into a single target model; however, they suffer from interference and degraded performance among tasks, largely due to limited flexibility in candidate selection and training pipelines. To address these issues, we propose a framework that adaptively selects and aggregates knowledge from diverse LLMs to build a single, stronger model, avoiding the high memory overhead of ensembling and inflexible weight merging. Specifically, we design an adaptive selection network that identifies the most relevant source LLMs based on their scores, thereby reducing knowledge interference. We further propose a dynamic weighted fusion strategy that accounts for the inherent strengths of candidate LLMs, along with a feedback-driven loss function that prevents the selector from converging on a single subset of sources. Experimental results demonstrate that our method can enable a more stable and scalable knowledge aggregation process while reducing knowledge interference by up to 50% compared to existing approaches. Code is available at https://github.com/ZLKong/LLM_Integration
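The selection-and-fusion mechanism the abstract describes can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the function name `select_and_fuse`, the score inputs, and the softmax-over-scores weighting are all assumptions about how "scoring source LLMs, keeping the most relevant, and fusing with dynamic weights" might look.

```python
import numpy as np

def select_and_fuse(source_logits, scores, k=2, temperature=1.0):
    """Hypothetical sketch of adaptive selection + dynamic weighted fusion:
    score each source LLM, keep the top-k most relevant, and fuse their
    output logits with softmax-normalized weights. Names and shapes here
    are illustrative assumptions, not the paper's actual interface."""
    scores = np.asarray(scores, dtype=float)
    # Task-aware selection: indices of the k highest-scoring source models.
    top_k = np.argsort(scores)[-k:]
    # Dynamic weights: softmax over the selected models' relevance scores.
    w = np.exp(scores[top_k] / temperature)
    w /= w.sum()
    # Weighted fusion of the selected models' logits (vocab-sized vectors).
    fused = sum(wi * source_logits[i] for wi, i in zip(w, top_k))
    return fused, top_k, w

# Three hypothetical source models emitting 4-dim logit vectors.
logits = [np.array([1.0, 0.0, 0.0, 0.0]),
          np.array([0.0, 2.0, 0.0, 0.0]),
          np.array([0.0, 0.0, 3.0, 0.0])]
fused, chosen, weights = select_and_fuse(logits, scores=[0.1, 0.7, 0.9], k=2)
```

With these toy scores, the lowest-scoring source (index 0) is excluded entirely, so it contributes nothing to the fused output, which is how a hard top-k selector would limit interference from irrelevant models.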
Problem

Research questions and friction points this paper is trying to address.

Enabling flexible integration of multiple LLMs for scalable knowledge aggregation
Reducing memory overhead and interference in multi-LLM knowledge transfer
Improving adaptability and performance in dynamic data environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive selection network for relevant LLMs
Dynamic weighted fusion for candidate strengths
Feedback-driven loss prevents subset convergence
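The third bullet, preventing the selector from collapsing onto one subset of sources, can be illustrated with an entropy-style regularizer. This is a guess at the general shape of such a feedback-driven objective; the paper's exact loss may differ, and `beta` is an assumed hyperparameter.

```python
import numpy as np

def feedback_loss(selection_probs, task_loss, beta=0.1):
    """Hypothetical sketch of a feedback-driven objective: augment the task
    loss with a negative-entropy penalty on the selector's distribution over
    source LLMs, discouraging collapse onto a single subset of sources."""
    p = np.asarray(selection_probs, dtype=float)
    p = p / p.sum()  # normalize to a probability distribution
    entropy = -np.sum(p * np.log(p + 1e-12))
    # A near-collapsed (low-entropy) selector incurs a larger total loss.
    return task_loss - beta * entropy

# A collapsed selector is penalized relative to a balanced one.
collapsed = feedback_loss([0.97, 0.01, 0.01, 0.01], task_loss=1.0)
balanced = feedback_loss([0.25, 0.25, 0.25, 0.25], task_loss=1.0)
```

Because the uniform distribution maximizes entropy, the balanced selector receives the largest entropy bonus, so gradient descent on this objective pushes the selector away from always picking the same sources.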
👥 Authors
Zhenglun Kong
Harvard University
Efficient Deep Learning, Large Language Models, AI4Science
Zheng Zhan
Microsoft Research
Deep Learning, Machine Learning, Artificial Intelligence
Shiyue Hou
Northeastern University
Yifan Gong
Northeastern University
Xin Meng
University of Pittsburgh
AI and medical imaging
Pengwei Sui
Harvard University
Peiyan Dong
Northeastern University
Xuan Shen
Cornell Tech, Northeastern University
Efficient Deep Learning, ML Systems, AutoML
Zifeng Wang
Google
Pu Zhao
Northeastern University
Hao Tang
Peking University
Stratis Ioannidis
Northeastern University
Machine Learning, Networking
Yanzhi Wang
Northeastern University