🤖 AI Summary
This work addresses the challenge of deploying numerous task-specific adapters for on-device large language models under stringent memory constraints. To this end, it introduces a data-driven adapter clustering and fusion approach—the first of its kind—that requires only ten samples per task. By iteratively optimizing adapter representations, the method clusters similar adapters and merges those within the same cluster into a shared multi-task adapter. Applied to parameter-efficient adapters such as LoRA, this strategy significantly enhances cross-task generalization while adhering to tight storage budgets. Extensive experiments demonstrate the effectiveness and practicality of the proposed method on resource-constrained devices, establishing a new paradigm for efficient multi-task adapter deployment.
📝 Abstract
On-device large language models commonly employ task-specific adapters (e.g., LoRAs) to deliver strong performance on downstream tasks. While storing all available adapters is impractical due to memory constraints, mobile devices typically have sufficient capacity to store a limited number of these parameters. This raises a critical challenge: how to select representative adapters that generalize well across multiple tasks, a problem that remains unexplored in the existing literature. We propose D2C, a novel adapter-clustering method that leverages minimal task-specific examples (e.g., 10 per task) and employs an iterative optimization process to refine cluster assignments. The adapters within each cluster are merged, creating multi-task adapters deployable on resource-constrained devices. Experimental results demonstrate that our method effectively boosts performance under the considered storage budgets.
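The abstract does not give D2C's details, but the overall cluster-then-merge pipeline can be illustrated with a minimal sketch. The following is an assumption-laden stand-in, not the paper's method: adapters are represented by feature vectors (e.g., derived from the few task-specific examples), clustered with a naive k-means loop, and the LoRA factors within a cluster are merged by simple averaging. The function names, the k-means clustering, and the averaging merge are all illustrative choices.

```python
import numpy as np

def cluster_adapters(reps: np.ndarray, k: int, iters: int = 20) -> np.ndarray:
    """Assign each adapter to one of k clusters via a naive k-means loop.

    `reps` is an (n_adapters, dim) array of per-adapter representation
    vectors; how such vectors are built from the ~10 samples per task is
    a detail this sketch does not model.
    """
    centers = reps[:k].copy()  # deterministic init: first k adapters
    for _ in range(iters):
        # Assign each adapter to its nearest centroid (squared L2).
        dists = ((reps[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = np.argmin(dists, axis=1)
        # Update centroids; keep the old center if a cluster empties.
        centers = np.stack([
            reps[assign == c].mean(0) if (assign == c).any() else centers[c]
            for c in range(k)
        ])
    return assign

def merge_lora_cluster(adapters):
    """Merge the LoRAs of one cluster by averaging their (A, B) factors.

    Plain averaging is a naive stand-in for the paper's fusion step.
    """
    A = np.mean([a for a, _ in adapters], axis=0)
    B = np.mean([b for _, b in adapters], axis=0)
    return A, B
```

Under a fixed storage budget, only the k merged adapters need to be kept on device; at inference time a request for any task is served by the merged adapter of that task's cluster.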