🤖 AI Summary
To address the challenge of dynamically fusing LoRA adapters for multi-domain composite queries, this paper proposes a data-free, training-free, query-adaptive fusion method. Leveraging distributional metrics, in particular KL divergence, the method quantifies, layer-wise, the divergence between the hidden states of the base model and those induced by individual LoRA adapters; this enables real-time, query-driven generation of layer-specific fusion weights. Unlike static weighting or supervised fine-tuning, the approach achieves parameter-efficient and interpretable dynamic adapter scheduling. Evaluated on nine multilingual composite tasks, it outperforms static fusion by 5–6% and existing training-free baselines by 7–10%. Crucially, the computed weights exhibit coherent cross-layer semantic patterns, supporting the method's mechanistic soundness. The core contribution is the first fully training-free, label-free, query-aware LoRA fusion framework, eliminating both data dependency and gradient-based optimization while preserving expressivity and interpretability.
📝 Abstract
The deployment of large language models for specialized tasks often requires domain-specific parameter-efficient fine-tuning through Low-Rank Adaptation (LoRA) modules. However, effectively fusing these adapters to handle complex, multi-domain composite queries remains a critical challenge. Existing LoRA fusion approaches either use static weights, which assign equal relevance to each participating LoRA, or require data-intensive supervised training for every possible LoRA combination to obtain the respective optimal fusion weights. We propose qa-FLoRA, a novel query-adaptive, data-and-training-free method for LoRA fusion that dynamically computes layer-level fusion weights by measuring the distributional divergence between the base model and the respective adapters. Our approach eliminates the need for composite training data or domain-representative samples, making it readily applicable to existing adapter collections. Extensive experiments across nine multilingual composite tasks spanning the mathematics, coding, and medical domains show that qa-FLoRA outperforms static fusion by ~5% with LLaMA-2 and ~6% with LLaMA-3, and training-free baselines by ~7% with LLaMA-2 and ~10% with LLaMA-3, while significantly closing the gap with supervised baselines. Further, layer-level analysis of our fusion weights reveals interpretable fusion patterns, demonstrating the effectiveness of our approach for robust multi-domain adaptation.
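The divergence-to-weights mechanism described above can be sketched as follows. The paper's exact aggregation and normalization are not given here, so the choice of a per-layer softmax over KL scores, the `temperature` parameter, and the function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability distributions."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def softmax(x):
    z = x - np.max(x)          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fusion_weights(base_dists, adapter_dists, temperature=1.0):
    """Compute query-adaptive, layer-level fusion weights (illustrative sketch).

    base_dists:    [n_layers, dim] per-layer distributions from the base model
                   for the current query.
    adapter_dists: [n_adapters, n_layers, dim] per-layer distributions when
                   each LoRA adapter is applied individually.
    Returns:       [n_layers, n_adapters] weights; each row sums to 1.
    Assumption: a larger divergence from the base model signals a more
    relevant adapter for this query, so weights grow with KL score.
    """
    n_adapters, n_layers, _ = adapter_dists.shape
    weights = np.zeros((n_layers, n_adapters))
    for layer in range(n_layers):
        divs = np.array([
            kl_divergence(adapter_dists[a, layer], base_dists[layer])
            for a in range(n_adapters)
        ])
        weights[layer] = softmax(divs / temperature)
    return weights
```

In a sketch like this, the per-layer normalization is what lets different layers favor different adapters, which is consistent with the interpretable cross-layer fusion patterns the summary reports.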