MATO: Multi-objective Personalized Alignment with Test-time Optimization for Large Language Models

📅 2026-05-24
🤖 AI Summary
Large language models struggle to align flexibly with users' diverse and potentially conflicting multi-objective preferences unless they rely on additional training or predefined reward models. This work proposes MATO, a training-free framework for multi-objective personalized alignment that formulates alignment as a test-time optimization problem. The approach recovers preference rewards from the backbone LLM for objectives specified in natural language and dynamically adjusts objective weights during decoding to balance competing goals, without modifying model parameters or requiring external reward models. The summary describes the method as plug-and-play across backbone LLMs. In experiments on multiple datasets and backbone LLMs, it consistently outperforms strong baselines, achieving Pareto-improving multi-objective alignment and stronger steerability.
📝 Abstract
Aligning large language models (LLMs) with diverse and multifaceted user preferences is a fundamental challenge in personalized AI systems. Existing multi-objective alignment methods either rely on costly training or require pre-trained reward models for each preference, making it difficult for them to adapt to evolving preferences. Prompt-based personalization offers a training-free alternative, but prompting alone often provides limited steerability, as LLMs may overemphasize or overlook certain preferences and fail to give users reliable control over the relative importance of different objectives when conflicts arise, leading to suboptimal alignment. In this paper, we introduce MATO, a training-free framework for Multi-objective personalized Alignment with Test-time Optimization. MATO formulates personalization as a test-time optimization problem that steers the relative importance of multiple objectives through controllable weights during decoding, without modifying model parameters or requiring external reward models. Specifically, a reward discovery module recovers preference rewards directly from the backbone LLM for diverse objectives specified in natural language, while a weight optimization module dynamically adjusts objective weights based on the user's initial preferences and the partially generated response to balance competing objectives during generation. The resulting rewards and weights jointly guide an online optimization procedure over the token distribution, enabling better alignment with the target objectives. Extensive experiments across multiple datasets and backbone LLMs show that MATO consistently outperforms strong baselines, achieving Pareto-improving multi-objective alignment and stronger steerability. These results highlight test-time optimization as a promising direction for scalable, controllable, and model-agnostic personalized alignment.
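The abstract does not give MATO's equations, so the following is only a minimal sketch of the general idea it describes: per-objective rewards read from the backbone LLM, combined with controllable weights to reshape the next-token distribution. Every concrete choice here is an assumption for illustration and not the paper's method: the reward is taken as the log-probability ratio between a preference-prompted and a plain forward pass, the steering rule is a simple exponential tilt with a hypothetical strength `beta`, and `update_weights` is a hypothetical heuristic for rebalancing weights from the partial response.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def softmax(logits):
    return [math.exp(x) for x in log_softmax(logits)]


def implicit_rewards(base_logits, conditioned_logits):
    """ASSUMPTION: per-token reward for one objective is
    log p(token | prompt + preference) - log p(token | prompt)."""
    base = log_softmax(base_logits)
    cond = log_softmax(conditioned_logits)
    return [c - b for c, b in zip(cond, base)]


def steer(base_logits, rewards_per_objective, weights, beta=1.0):
    """ASSUMPTION: tilt the base distribution by the weighted rewards,
    p(y) proportional to p_base(y) * exp(beta * sum_i w_i * r_i(y))."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalize weights to sum to 1
    base = log_softmax(base_logits)
    scores = [
        b + beta * sum(wi * r[t] for wi, r in zip(w, rewards_per_objective))
        for t, b in enumerate(base)
    ]
    return softmax(scores)


def update_weights(weights, target_shares, accumulated_rewards, lr=0.5):
    """HYPOTHETICAL heuristic: raise the weight of objectives whose share
    of accumulated reward lags the user's target share."""
    achieved = softmax(accumulated_rewards)
    raw = [
        w * math.exp(lr * (t - a))
        for w, t, a in zip(weights, target_shares, achieved)
    ]
    total = sum(raw)
    return [x / total for x in raw]


# Toy vocabulary of three tokens; the logits stand in for LLM forward passes.
base_logits = [0.0, 0.0, 0.0]
r_a = implicit_rewards(base_logits, [2.0, 0.0, 0.0])  # objective A favors token 0
r_b = implicit_rewards(base_logits, [0.0, 2.0, 0.0])  # objective B favors token 1

probs_a_only = steer(base_logits, [r_a, r_b], weights=[1.0, 0.0])
probs_balanced = steer(base_logits, [r_a, r_b], weights=[0.5, 0.5])
new_weights = update_weights([0.5, 0.5], [0.5, 0.5], accumulated_rewards=[1.0, 0.0])
```

In this toy setting, putting all weight on objective A makes token 0 the most likely next token, equal weights give tokens 0 and 1 the same probability, and the weight update shifts weight toward objective B once A has accumulated more reward. A real implementation would obtain the logits from the backbone LLM at each decoding step.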
Problem

Research questions and friction points this paper is trying to address.

personalized alignment
multi-objective optimization
large language models
test-time optimization
preference steering
Innovation

Methods, ideas, or system contributions that make the work stand out.

test-time optimization
multi-objective alignment
personalized LLMs
reward discovery
dynamic weight adjustment