MTA: A Merge-then-Adapt Framework for Personalized Large Language Model

πŸ“… 2025-11-25
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Personalized large language models (PLLMs) face two key challenges: user-level fine-tuning incurs storage overhead that scales linearly with the number of users, and few-shot users struggle to achieve effective personalization. To address these, we propose MTA, a Merge-then-Adapt framework that enables efficient and scalable personalization via meta-personalization decoupling. MTA constructs a shared Meta-LoRA Bank and integrates adaptive LoRA fusion with stacking to dynamically retrieve and compose user-specific low-rank adaptations. By unifying meta-learning, vector retrieval, and two-stage fine-tuning, MTA significantly reduces storage requirements while preserving adaptation fidelity. On the LaMP benchmark, MTA consistently outperforms state-of-the-art methods, delivering substantial gains in few-shot personalization accuracy without compromising scalability or generalization.

πŸ“ Abstract
Personalized Large Language Models (PLLMs) aim to align model outputs with individual user preferences, a crucial capability for user-centric applications. However, the prevalent approach of fine-tuning a separate module for each user faces two major limitations: (1) storage costs scale linearly with the number of users, rendering the method unscalable; and (2) fine-tuning a static model from scratch often yields suboptimal performance for users with sparse data. To address these challenges, we propose MTA, a Merge-then-Adapt framework for PLLMs. MTA comprises three key stages. First, we construct a shared Meta-LoRA Bank by selecting anchor users and pre-training meta-personalization traits within meta-LoRA modules. Second, to ensure scalability and enable dynamic personalization combination beyond static models, we introduce an Adaptive LoRA Fusion stage. This stage retrieves and dynamically merges the most relevant anchor meta-LoRAs to synthesize a user-specific one, thereby eliminating the need for user-specific storage and supporting more flexible personalization. Third, we propose a LoRA Stacking for Few-Shot Personalization stage, which applies an additional ultra-low-rank, lightweight LoRA module on top of the merged LoRA. Fine-tuning this module enables effective personalization under few-shot settings. Extensive experiments on the LaMP benchmark demonstrate that our approach outperforms existing SOTA methods across multiple tasks.
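The abstract describes the Adaptive LoRA Fusion stage as retrieving the most relevant anchor meta-LoRAs and dynamically merging them into a user-specific module. The paper's exact retrieval metric and fusion rule are not given here; the sketch below assumes cosine-similarity retrieval over user profile embeddings and softmax-weighted averaging of the low-rank factors (function names and shapes are illustrative, not the authors' implementation):

```python
import numpy as np

def fuse_meta_loras(user_emb, anchor_embs, anchor_loras, k=2):
    """Retrieve the k most similar anchor meta-LoRAs and merge them into a
    single user-specific LoRA via similarity-weighted averaging (a sketch)."""
    # Cosine similarity between the user profile and each anchor profile.
    sims = anchor_embs @ user_emb / (
        np.linalg.norm(anchor_embs, axis=1) * np.linalg.norm(user_emb) + 1e-8
    )
    top = np.argsort(sims)[-k:]                      # indices of the top-k anchors
    w = np.exp(sims[top]) / np.exp(sims[top]).sum()  # softmax fusion weights
    # Weighted merge of the low-rank factors (A: r x d_in, B: d_out x r).
    A = sum(wi * anchor_loras[i][0] for wi, i in zip(w, top))
    B = sum(wi * anchor_loras[i][1] for wi, i in zip(w, top))
    return A, B
```

Because the merged module is synthesized on the fly from the shared bank, no per-user LoRA needs to be stored, which is how the framework avoids storage that grows linearly with the user count.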
Problem

Research questions and friction points this paper is trying to address.

Addresses storage scalability in personalized LLMs
Enhances performance for users with sparse data
Enables dynamic personalization through adaptive fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Merge-then-Adapt framework for personalized language models
Adaptive LoRA Fusion for dynamic personalization combination
LoRA Stacking enables effective few-shot personalization
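The LoRA Stacking stage places a trainable ultra-low-rank module on top of the frozen merged LoRA, so only a tiny number of parameters are fine-tuned per few-shot user. Assuming the standard LoRA parameterization (h = W0·x + B·A·x), the effective forward pass might look like the following sketch (variable names are illustrative):

```python
import numpy as np

def stacked_forward(x, W0, merged, stacked, alpha=1.0):
    """Forward pass combining the frozen base weight W0, the frozen merged
    LoRA (A_m, B_m), and a trainable ultra-low-rank stacked LoRA (A_s, B_s):
        h = W0 x + alpha * B_m A_m x + alpha * B_s A_s x   (a sketch)."""
    A_m, B_m = merged    # frozen, synthesized from anchor meta-LoRAs
    A_s, B_s = stacked   # e.g. rank 1-2, fine-tuned on few-shot user data
    return W0 @ x + alpha * (B_m @ (A_m @ x)) + alpha * (B_s @ (A_s @ x))
```

Keeping the stacked module's rank extremely low limits both the per-user training cost and the risk of overfitting to sparse user data, which matches the few-shot motivation above.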
πŸ”Ž Similar Papers
No similar papers found.
👥 Authors
Xiaopeng Li
City University of Hong Kong
Yuanjin Zheng
City University of Hong Kong, Zhejiang University
Wanyu Wang
City University of Hong Kong
Wenlin Zhang
City University of Hong Kong
Pengyue Jia
PhD candidate of Data Science, City University of Hong Kong
Information Retrieval · Large Language Models · GeoAI
Yiqi Wang
City University of Hong Kong
Maolin Wang
City University of Hong Kong
Xuetao Wei
Associate Professor, Southern University of Science and Technology
AI Ethics · AI Safety
Xiangyu Zhao
City University of Hong Kong