🤖 AI Summary
Existing LoRA-based domain generalization methods rely on task labels or additional fine-tuning and activate all LoRA modules uniformly, leading to either parameter redundancy or insufficiency. This work proposes the first training-free, adaptive hierarchical LoRA routing framework. It introduces Rank-One Components (ROCs) as the fundamental routing units and designs a dual-level routing mechanism: sequence-level selection based on Gaussian likelihoods derived from LoRA's structural properties, and token-level routing that enables fine-grained, dynamic module activation. We theoretically prove that the framework selects the most relevant LoRA modules with high probability. Without any training, our method achieves cross-domain generalization while preserving inference efficiency, improving accuracy by up to 55% over prior state-of-the-art approaches.
📝 Abstract
Low-Rank Adaptation (LoRA) has emerged as a widely used technique for adapting large language models (LLMs) to new domains, due to its modular design and broad availability on platforms such as HuggingFace. This availability has motivated efforts to reuse existing LoRAs for domain generalization.
However, existing methods often rely on explicit task labels or additional training, which are impractical for deployment. Moreover, they typically activate a fixed number of entire LoRA modules, leading to parameter redundancy or insufficiency that degrades performance.
In this paper, we propose `HiLoRA`, a training-free framework that performs adaptive hierarchical routing over LoRA pools. Drawing on structural properties of LoRA, we define rank-one components (ROCs), in which each rank parameter is regarded as an independent unit. For a given input sequence, `HiLoRA` first adaptively selects a subset of LoRAs and determines their ROC allocation based on Gaussian likelihoods at the sequence level. At the token level, it further refines routing by activating only the most informative ROCs.
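The two-level routing described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the per-LoRA Gaussian statistics (`mu`, `var`), the pooling by mean, and the ROC scoring by activation magnitude are all simplifying assumptions made for this sketch.

```python
import numpy as np

def rank_one_components(A, B):
    """Decompose a LoRA update B @ A (B: d_out x r, A: r x d_in)
    into r rank-one components (ROCs), one per rank dimension."""
    r = A.shape[0]
    return [(B[:, i:i + 1], A[i:i + 1, :]) for i in range(r)]

def sequence_level_select(x_seq, lora_stats, k):
    """Pick the k LoRAs whose Gaussian (mu, diagonal var) assigns the
    highest log-likelihood to the pooled sequence representation.
    lora_stats maps name -> (mu, var); these statistics are assumed
    (hypothetically) precomputed from each LoRA's domain activations."""
    h = x_seq.mean(axis=0)  # mean-pooled sequence embedding (assumption)
    scores = {}
    for name, (mu, var) in lora_stats.items():
        # diagonal-Gaussian log-likelihood, up to an additive constant
        scores[name] = -0.5 * np.sum((h - mu) ** 2 / var + np.log(var))
    return sorted(scores, key=scores.get, reverse=True)[:k]

def token_level_route(x_tok, rocs, m):
    """For one token, activate only the m ROCs with the largest
    activation magnitude |a_i @ x| and sum their outputs."""
    mags = [abs(float(a @ x_tok)) for (_, a) in rocs]
    top = np.argsort(mags)[-m:]
    return sum(rocs[i][0][:, 0] * float(rocs[i][1] @ x_tok) for i in top)
```

Note that when all `r` ROCs of a LoRA are activated, the token-level output recovers the full low-rank update `B @ A @ x`; routing with `m < r` simply drops the least informative rank-one terms for that token.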
We further provide theoretical guarantees that `HiLoRA` selects the most relevant LoRAs with high probability.
Extensive experiments show that `HiLoRA` achieves substantial improvements in domain generalization, with accuracy gains of up to 55% over state-of-the-art baselines, while maintaining comparable inference throughput.