Data-driven Clustering and Merging of Adapters for On-device Large Language Models

📅 2026-01-24

🤖 AI Summary
This work addresses the challenge of deploying many task-specific adapters for on-device large language models under tight memory constraints. It introduces D2C, a data-driven adapter clustering and merging method that needs only a handful of examples per task (e.g., 10); the authors describe the underlying adapter-selection problem as previously unexplored. An iterative optimization process refines the cluster assignments, and the adapters within each cluster are merged into a shared multi-task adapter. The approach targets parameter-efficient fine-tuning adapters such as LoRA, and the reported experiments show improved performance under the considered storage budgets on resource-constrained devices.

📝 Abstract
On-device large language models commonly employ task-specific adapters (e.g., LoRAs) to deliver strong performance on downstream tasks. While storing all available adapters is impractical due to memory constraints, mobile devices typically have sufficient capacity to store a limited number of them. This raises a critical challenge: how to select representative adapters that generalize well across multiple tasks, a problem that remains unexplored in existing literature. We propose D2C, a novel method for adapter clustering that leverages minimal task-specific examples (e.g., 10 per task) and employs an iterative optimization process to refine cluster assignments. The adapters within each cluster are merged, creating multi-task adapters deployable on resource-constrained devices. Experimental results demonstrate that our method effectively boosts performance for the considered storage budgets.
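The abstract does not spell out D2C's objective, update rule, or merge operator, so the following is only an illustrative sketch of the general idea (cluster adapters using a few examples per task, then merge each cluster), not the authors' algorithm. It assumes a toy linear model in place of an LLM, treats each adapter as an additive weight delta, merges by plain averaging, and alternates a loss-based assignment step with a merge step in the style of k-means. All names (`cluster_adapters`, `task_loss`, `merge`) and the farthest-point initialisation are hypothetical choices made for this example.

```python
import numpy as np

def merge(adapters):
    """Merge adapters by averaging their weight deltas (assumed merge rule)."""
    return np.mean(adapters, axis=0)

def task_loss(delta, X, y, W0):
    """Mean squared error of the adapted toy linear model on one task's examples."""
    pred = X @ (W0 + delta)
    return float(np.mean((pred - y) ** 2))

def cluster_adapters(adapters, examples, W0, k, n_iters=10):
    """Alternate between assigning tasks to merged adapters and re-merging.

    adapters: list of weight-delta vectors, one per task
    examples: list of (X, y) pairs holding a few samples per task
    """
    n = len(adapters)
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    merged = [adapters[0].copy()]
    while len(merged) < k:
        dists = [min(np.linalg.norm(a - c) for c in merged) for a in adapters]
        merged.append(adapters[int(np.argmax(dists))].copy())

    assign = np.full(n, -1)
    for _ in range(n_iters):
        # Assignment step: each task picks the merged adapter with the
        # lowest loss on its own few examples.
        new_assign = np.array([
            int(np.argmin([task_loss(c, X, y, W0) for c in merged]))
            for (X, y) in examples
        ])
        if np.array_equal(new_assign, assign):
            break  # assignments are stable
        assign = new_assign
        # Merge step: rebuild each cluster's shared adapter from its members.
        for j in range(k):
            members = [adapters[i] for i in range(n) if assign[i] == j]
            if members:
                merged[j] = merge(members)
    return assign, merged

# Toy data: six tasks forming two groups, ten examples per task.
rng = np.random.default_rng(0)
d = 4
W0 = rng.normal(size=d)                      # frozen base weights
bases = [np.ones(d), -np.ones(d)]            # two underlying task families
adapters = [bases[g] + 0.01 * rng.normal(size=d) for g in (0, 0, 0, 1, 1, 1)]
examples = []
for a in adapters:
    X = rng.normal(size=(10, d))
    examples.append((X, X @ (W0 + a)))       # labels from the task's own adapter

assign, merged = cluster_adapters(adapters, examples, W0, k=2)
print(assign)
```

With a storage budget of `k` adapters, only the `k` merged deltas would be kept on the device, and each task would be routed to the merged adapter of its cluster. In a real setting the loss would come from the LLM evaluated on each task's examples, and the merge rule could be something other than an unweighted average.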

Problem

Research questions and friction points this paper is trying to address.

on-device large language models
adapter selection
multi-task generalization
memory constraints
task-specific adapters

Innovation

Methods, ideas, or system contributions that make the work stand out.

adapter clustering
on-device LLMs
data-driven merging
multi-task adaptation
parameter-efficient fine-tuning