Data-driven Clustering and Merging of Adapters for On-device Large Language Models

📅 2026-01-24
🤖 AI Summary
This work addresses the challenge of deploying numerous task-specific adapters for on-device large language models under stringent memory constraints. To this end, it introduces a data-driven adapter clustering and fusion approach—the first of its kind—that requires only ten samples per task. By iteratively optimizing adapter representations, the method clusters similar adapters and merges those within the same cluster into a shared multi-task adapter. Integrated with parameter-efficient fine-tuning frameworks such as LoRA, this strategy significantly enhances cross-task generalization while adhering to tight storage budgets. Extensive experiments demonstrate the effectiveness and practicality of the proposed method on resource-constrained devices, establishing a new paradigm for efficient multi-task adapter deployment.

📝 Abstract
On-device large language models commonly employ task-specific adapters (e.g., LoRAs) to deliver strong performance on downstream tasks. While storing all available adapters is impractical due to memory constraints, mobile devices typically have sufficient capacity to store a limited number of these parameters. This raises a critical challenge: how to select representative adapters that generalize well across multiple tasks, a problem that remains unexplored in the existing literature. We propose D2C, a novel adapter clustering method that leverages minimal task-specific examples (e.g., 10 per task) and employs an iterative optimization process to refine cluster assignments. The adapters within each cluster are merged, creating multi-task adapters deployable on resource-constrained devices. Experimental results demonstrate that our method effectively boosts performance under the considered storage budgets.
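The pipeline the abstract describes (cluster the available adapters, then merge each cluster into one shared adapter) can be sketched as follows. This is a minimal illustration only, not the paper's D2C algorithm: it substitutes plain k-means over flattened adapter weights for the paper's data-driven, sample-based representations, and uniform parameter averaging for its merging step. All function names are hypothetical, and each adapter is assumed to be a list of NumPy weight matrices (e.g., LoRA A/B factors).

```python
import numpy as np

def flatten_adapter(adapter):
    """Concatenate an adapter's weight matrices into a single vector."""
    return np.concatenate([w.ravel() for w in adapter])

def cluster_adapters(adapters, n_clusters, n_iters=10, seed=0):
    """Assign adapters to clusters via k-means on flattened weights.

    Stand-in for the paper's iterative, data-driven clustering:
    here similarity is measured purely in weight space.
    """
    reps = np.stack([flatten_adapter(a) for a in adapters])
    rng = np.random.default_rng(seed)
    centers = reps[rng.choice(len(reps), n_clusters, replace=False)]
    labels = np.zeros(len(reps), dtype=int)
    for _ in range(n_iters):
        # Assign each adapter to its nearest centroid.
        dists = np.linalg.norm(reps[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centroids from current assignments.
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = reps[labels == k].mean(axis=0)
    return labels

def merge_cluster(adapters, labels, k):
    """Merge all adapters in cluster k by uniform parameter averaging."""
    members = [a for a, lbl in zip(adapters, labels) if lbl == k]
    return [np.mean([m[i] for m in members], axis=0)
            for i in range(len(members[0]))]
```

Under the storage-budget framing, `n_clusters` is the number of merged adapters the device can afford to keep; the paper's contribution lies in choosing the clustering objective from a handful of task samples rather than from raw weights as done here.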
Problem

Research questions and friction points this paper is trying to address.

on-device large language models
adapter selection
multi-task generalization
memory constraints
task-specific adapters
Innovation

Methods, ideas, or system contributions that make the work stand out.

adapter clustering
on-device LLMs
data-driven merging
multi-task adaptation
parameter-efficient fine-tuning
Ondrej Bohdal
Samsung Research
T. Ceritli
Samsung R&D Institute UK, United Kingdom
Mete Ozay
Samsung R&D Institute UK, United Kingdom
Jijoong Moon
Samsung Research, South Korea
Kyeng-Hun Lee
Samsung Research, South Korea
Hyeonmok Ko
Principal Engineer, SAMSUNG ELECTRONICS CO. LTD.
Umberto Michieli
Samsung R&D Institute UK, United Kingdom