HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging

📅 2025-07-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multi-task low-rank adaptation (LoRA) of large language models (LLMs) incurs substantial memory overhead because a separate adapter must be stored per task, while existing adapter merging methods suffer significant performance degradation. Method: the paper proposes HydraOpt, a tunable adapter merging framework that exploits the intrinsic similarity among LoRA weight matrices and formulates merging as an optimization problem. Contribution/Results: the approach enables a continuous, controllable trade-off between storage compression and task performance, overcoming the fixed compromise of conventional merging schemes. Empirically, at roughly 52% of the original storage cost (a 48% reduction), the average performance drop is only 0.2–1.8%, substantially outperforming prior merging techniques and making the method practical for resource-constrained deployments.

📝 Abstract
Large language models (LLMs) often leverage adapters, such as low-rank-based adapters, to achieve strong performance on downstream tasks. However, storing a separate adapter for each task significantly increases memory requirements, posing a challenge for resource-constrained environments such as mobile devices. Although model merging techniques can reduce storage costs, they typically result in substantial performance degradation. In this work, we introduce HydraOpt, a new model merging technique that capitalizes on the inherent similarities between the matrices of low-rank adapters. Unlike existing methods that produce a fixed trade-off between storage size and performance, HydraOpt allows us to navigate this spectrum of efficiency and performance. Our experiments show that HydraOpt significantly reduces storage size (48% reduction) compared to storing all adapters, while achieving competitive performance (0.2-1.8% drop). Furthermore, it outperforms existing merging techniques in terms of performance at the same or slightly worse storage efficiency.
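The abstract describes merging low-rank adapters by exploiting the similarities between their matrices, with a tunable position on the storage-performance spectrum. The sketch below is a minimal illustration of that general idea under assumed details, not HydraOpt's actual algorithm: it greedily groups per-task LoRA B matrices by cosine similarity, averages each group into a shared matrix, and keeps the per-task A matrices. The function names, the greedy grouping rule, and the `num_groups` knob are all assumptions introduced for illustration.

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine similarity between two flattened matrices."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def merge_lora_adapters(adapters, num_groups):
    """Group per-task LoRA B matrices by similarity and average within groups.

    adapters   : list of (B, A) pairs, B in R^{d x r}, A in R^{r x k}
    num_groups : the tunable knob; len(adapters) keeps every adapter separate,
                 1 collapses all B matrices into a single shared matrix.
    Returns shared B matrices, per-task A matrices, and a task -> group mapping.
    """
    flats = [B.ravel() for B, _ in adapters]
    groups = [[0]]  # start with the first adapter in its own group
    for i in range(1, len(adapters)):
        if len(groups) < num_groups:
            groups.append([i])  # open a new group while the budget allows
            continue
        # otherwise assign adapter i to the most similar existing group
        sims = [np.mean([cosine_sim(flats[i], flats[j]) for j in g]) for g in groups]
        groups[int(np.argmax(sims))].append(i)
    shared_B = [np.mean([adapters[i][0] for i in g], axis=0) for g in groups]
    per_task_A = [A for _, A in adapters]
    assignment = {i: gi for gi, g in enumerate(groups) for i in g}
    return shared_B, per_task_A, assignment

# Toy usage: 5 tasks, hidden size 64, rank 8, two merged groups.
rng = np.random.default_rng(0)
adapters = [(rng.normal(size=(64, 8)), rng.normal(size=(8, 64))) for _ in range(5)]
shared_B, per_task_A, assignment = merge_lora_adapters(adapters, num_groups=2)
stored = len(shared_B) * 64 * 8 + len(per_task_A) * 8 * 64
original = len(adapters) * (64 * 8 + 8 * 64)
print(f"stored params: {stored} vs. original: {original} ({stored / original:.0%})")
```

In this toy setup, setting `num_groups` to the number of tasks recovers per-task adapters (no compression), while setting it to 1 gives maximal compression; that dial-up, dial-down behavior is the kind of efficiency-performance navigation the paper claims, even though the real merging criterion may differ.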
Problem

Research questions and friction points this paper is trying to address.

Balancing efficiency and performance in adapter merging for LLMs
Reducing memory usage of multiple adapters in resource-limited environments
Minimizing performance degradation while optimizing storage costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

HydraOpt merges low-rank adapters efficiently
Optimizes storage-performance trade-off dynamically
Reduces storage by 48% with only a 0.2-1.8% performance drop (see the storage sketch below)
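The 48% figure corresponds to keeping roughly half of the original adapter parameters. As a purely illustrative back-of-the-envelope check, the snippet below assumes square weight matrices and that one of the two low-rank factors is shared across tasks; this sharing scheme is an assumption for illustration, not the paper's stated mechanism.

```python
# Back-of-the-envelope storage accounting (illustrative assumptions only:
# T tasks, square d x d weight matrices, rank-r pairs B in R^{d x r}, A in R^{r x d},
# and a single B matrix shared across all tasks).
T, d, r = 5, 4096, 16

separate = T * (d * r + r * d)   # one (B, A) pair stored per task
shared   = d * r + T * (r * d)   # one shared B, per-task A matrices

print(f"separate adapters: {separate:,} params")
print(f"with shared B    : {shared:,} params "
      f"({shared / separate:.0%} of original, {1 - shared / separate:.0%} reduction)")
# With T = 5 this keeps 60% of the original storage; as T grows the ratio
# approaches 50%, the same ballpark as the ~52% reported for HydraOpt.
```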
Taha Ceritli
Samsung Research UK
Machine Learning

Ondrej Bohdal
Samsung Research
Machine Learning, Deep Learning, Computer Vision, Natural Language Processing

Mete Ozay
Samsung R&D Institute UK, United Kingdom

Jijoong Moon
Samsung Research, South Korea

Kyeng-Hun Lee
Samsung Research, South Korea

Hyeonmok Ko
Principal Engineer, SAMSUNG ELECTRONICS CO. LTD.
Large Language Model, AI, Natural Language Understanding, Wireless Communications, Network Protocol

Umberto Michieli
Samsung R&D Institute UK, United Kingdom