Multi-objective Large Language Model Alignment with Hierarchical Experts

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diverse and often conflicting human preferences impede effective multi-objective alignment, making it difficult for existing methods to dynamically balance objectives without retraining and frequently yielding Pareto-suboptimal solutions. Method: We propose the Hierarchical Mixture-of-Experts (HoE) architecture, which integrates LoRA-based fine-tuning, Mixture-of-Experts (MoE), and preference-aware routing to enable parameter-efficient, plug-and-play, training-free dynamic adaptation. Contribution/Results: HoE is the first method that lets large language models generalize zero-shot across the entire Pareto frontier to arbitrary preference combinations. Evaluated across six benchmarks, 14 objectives, and 200 preference configurations, HoE consistently outperforms 15 state-of-the-art baselines, significantly improving multi-objective balance and cross-preference generalization.

📝 Abstract
Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce HoE (Hierarchical Mixture-of-Experts), a lightweight, parameter-efficient, and plug-and-play approach that eliminates the need for model training, while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, HoE consists of three hierarchical components: LoRA Experts, Router Experts, and Preference Routing, reaching optimal Pareto frontiers and achieving a trade-off between parameter size, training cost, and performance. We evaluate HoE on 14 objectives and 200 different preferences across 6 benchmarks, demonstrating superior performance over 15 recent baselines. Code is available in the supplementary materials.
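The core idea — routing a user's preference vector to a weighted combination of per-objective LoRA experts, with no retraining — can be illustrated with a minimal sketch. This is not the paper's actual code: the expert names, shapes, and the simple weighted-sum merge are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of preference-aware routing over per-objective
# LoRA experts. Objective names, sizes, and the merge rule are assumptions,
# not the paper's implementation.

rng = np.random.default_rng(0)
d, r = 8, 2                      # hidden size, LoRA rank (toy values)
objectives = ["helpful", "harmless", "humor"]

# One low-rank expert (A, B) per alignment objective; delta_W = A @ B.
lora_experts = {
    name: (rng.standard_normal((d, r)), rng.standard_normal((r, d)))
    for name in objectives
}

def route(preference):
    """Normalize a user preference dict into expert weights summing to 1."""
    total = sum(preference.values())
    return {k: v / total for k, v in preference.items()}

def merged_delta(preference):
    """Combine LoRA deltas as a preference-weighted sum -- no retraining."""
    weights = route(preference)
    delta = np.zeros((d, d))
    for name, (A, B) in lora_experts.items():
        delta += weights.get(name, 0.0) * (A @ B)
    return delta

# A user who values harmlessness twice as much as helpfulness:
delta = merged_delta({"helpful": 1.0, "harmless": 2.0})
print(delta.shape)  # (8, 8)
```

Because the merge is a cheap linear combination of adapter weights, a new preference only changes the routing weights, which is what makes the plug-and-play, training-free adaptation possible.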
Problem

Research questions and friction points this paper is trying to address.

Aligning LLMs to meet multiple conflicting human preferences
Balancing trade-offs without costly retraining or suboptimal results
Adapting LLMs across Pareto frontier for diverse user preferences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Mixture-of-Experts (HoE) with LoRA Experts, Router Experts, and Preference Routing
Parameter-efficient, plug-and-play solution requiring no model training
Reaches near-optimal Pareto frontiers and adapts to arbitrary preferences without retraining