AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs

πŸ“… 2025-05-31
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the challenge of determining optimal data mixing ratios in multi-task alignment of large language models, this paper proposes AMA—the first theoretically guaranteed convergent adaptive mixing framework. AMA dynamically adjusts either task-specific data sampling weights (AMA-S) or loss weighting (AMA-R) to jointly optimize utility, harmlessness, and honesty. Its core innovation is a minimax balancing mechanism grounded in expert-model loss deviation, integrated with preference optimization (DPO), online learning (EXP3), and convex optimization analysis. AMA-S achieves an $O(1/\sqrt{T})$ theoretical convergence rate. Empirical evaluation on multi-task alignment benchmarks shows AMA attains an average win rate 8.2% higher than standard joint training and model merging, significantly reducing reliance on manual hyperparameter tuning.

πŸ“ Abstract
When aligning large language models (LLMs), their performance on various tasks (such as being helpful, harmless, and honest) depends heavily on the composition of their training data. However, selecting a data mixture that achieves strong performance across all tasks is challenging. Existing approaches rely on large ablation studies, heuristics, or human intuition, but these can be prohibitively expensive and suboptimal. We study this problem in the setting of preference optimization via DPO and introduce AutoMixAlign (AMA), a theoretically-grounded algorithm that adaptively mixes datasets during training to balance performance across tasks. AMA first trains *specialist models* for each task to determine losses that correspond to strong task performance. Then, it trains a generalist model using a novel minimax optimization that prioritizes tasks for which generalist model losses deviate most from specialist model losses. To optimize this problem, we propose two algorithms: (1) AMA-R, which adaptively reweights the objective to prioritize tasks, and (2) AMA-S, which adaptively adjusts how much data is sampled from each task to prioritize tasks. Both algorithms achieve a convergence rate of $O(1/\sqrt{T})$ in the convex case. AMA-R's convergence result follows from Sagawa et al. (2019), and we provide a convergence proof for AMA-S using online learning techniques such as EXP3. We evaluate AMA on several multitask alignment setups and find that AMA outperforms the standard alignment approach -- which simply optimizes the total loss across all tasks -- and also outperforms model merging methods.
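The AMA-S mechanism described above can be illustrated with a simplified sketch: maintain EXP3-style weights over tasks, sample a task to train on, and treat the excess of the generalist's loss over the specialist's reference loss as the "reward" so that lagging tasks are sampled more often. This is not the paper's implementation; `ama_s_sketch`, its arguments, and the constant loss functions in the usage example are hypothetical stand-ins (e.g. real task losses would be DPO losses evaluated on sampled batches).

```python
import math
import random

def ama_s_sketch(task_losses, specialist_losses, T=1000, eta=0.1, seed=0):
    """Toy EXP3-style adaptive task sampling (a sketch of the AMA-S idea).

    task_losses: list of callables, task_losses[i](t) -> current generalist
        loss on task i at step t (hypothetical stand-in for a DPO loss eval).
    specialist_losses: list of fixed reference losses from specialist models.
    Returns the final sampling distribution over tasks.
    """
    rng = random.Random(seed)
    K = len(specialist_losses)
    log_w = [0.0] * K  # EXP3 log-weights over tasks
    for t in range(T):
        # Softmax over log-weights gives the task-sampling distribution.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        total = sum(w)
        p = [wi / total for wi in w]
        # Sample one task and (in a real run) take a gradient step on it.
        i = rng.choices(range(K), weights=p)[0]
        # Reward = excess of generalist loss over the specialist reference,
        # so tasks that deviate most from their specialist get prioritized.
        excess = max(0.0, task_losses[i](t) - specialist_losses[i])
        # Importance-weighted EXP3 update on the sampled arm only.
        log_w[i] += eta * excess / p[i]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

# Usage: task 0 lags its specialist, task 1 already matches it, so
# task 0 should end up with the larger sampling probability.
p = ama_s_sketch([lambda t: 1.0, lambda t: 0.2], [0.2, 0.2], T=200, eta=0.05)
```

In the actual algorithm the loss evaluations come from minibatches during DPO training and the update includes the usual EXP3 exploration term; the sketch only shows how loss deviation drives the sampling weights.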
Problem

Research questions and friction points this paper is trying to address.

Balancing performance across tasks in LLM alignment
Adaptively mixing datasets for multi-task preference optimization
Optimizing data composition without expensive ablation studies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive dataset mixing via minimax optimization
Specialist models guide generalist training
Two algorithms: AMA-R and AMA-S
πŸ”Ž Similar Papers
No similar papers found.