Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment

๐Ÿ“… 2025-02-20
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿค– AI Summary
In multi-objective alignment (MOA), conflicting human preferences impede convergence to the Pareto frontier, causing gradient direction inconsistencies in DPO-based methods. To address this, we propose a self-improving DPO frameworkโ€”the first to integrate autonomous generation and Pareto-optimal response selection directly into the DPO pipeline. Specifically, an LLM performs self-reflection to generate diverse candidate responses; these are then filtered via multi-objective preference modeling and Pareto dominance testing, yielding high-quality self-supervised preference pairs. By bypassing explicit preference conflict resolution, our method enables end-to-end optimization toward the Pareto frontier. Evaluated on two standard MOA benchmarks, it achieves significant improvements in Pareto coverage and hypervolume, consistently outperforming existing MOA approaches.
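The filtering step described above can be sketched as a Pareto dominance test over per-objective reward scores. This is a minimal illustration, not the paper's implementation; the response labels and reward values are hypothetical:

```python
def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b`: at least as good
    on every objective and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only candidates whose reward vectors are non-dominated."""
    return [
        c for i, c in enumerate(candidates)
        if not any(
            dominates(candidates[j][1], c[1])
            for j in range(len(candidates)) if j != i
        )
    ]

# Hypothetical (helpfulness, harmlessness) reward vectors for four
# self-generated candidate responses.
candidates = [
    ("r1", (0.9, 0.2)),
    ("r2", (0.6, 0.7)),
    ("r3", (0.5, 0.6)),  # dominated by r2 on both objectives
    ("r4", (0.3, 0.9)),
]
front = pareto_front(candidates)
# r3 is filtered out; r1, r2, r4 remain as mutually non-dominated
```

With more than two objectives the same test applies unchanged, since `dominates` iterates over however many score components the vectors carry.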

๐Ÿ“ Abstract
Multi-Objective Alignment (MOA) aims to align LLMs' responses with multiple human preference objectives, with Direct Preference Optimization (DPO) emerging as a prominent approach. However, we find that DPO-based MOA approaches suffer from widespread preference conflicts in the data, where different objectives favor different responses. This results in conflicting optimization directions, hindering the optimization on the Pareto Front. To address this, we propose to construct Pareto-optimal responses to resolve preference conflicts. To efficiently obtain and utilize such responses, we propose a self-improving DPO framework that enables LLMs to self-generate and select Pareto-optimal responses for self-supervised preference alignment. Extensive experiments on two datasets demonstrate the superior Pareto Front achieved by our framework compared to various baselines. Code is available at https://github.com/zyttt-coder/SIPO.
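One natural way to turn Pareto-filtered candidates into self-supervised DPO training data, as the abstract suggests, is to pair each response with any response it dominates. The sketch below assumes scalar reward scores per objective and hypothetical response labels; it is one plausible pairing rule, not the authors' exact construction:

```python
def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b`."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def build_preference_pairs(candidates):
    """Form (chosen, rejected) pairs whenever one candidate's reward
    vector Pareto-dominates another's, so both objectives agree on the
    preference and no gradient conflict arises."""
    pairs = []
    for chosen, chosen_scores in candidates:
        for rejected, rejected_scores in candidates:
            if dominates(chosen_scores, rejected_scores):
                pairs.append({"chosen": chosen, "rejected": rejected})
    return pairs

# Hypothetical (helpfulness, harmlessness) scores for four candidates.
candidates = [
    ("r1", (0.9, 0.2)),
    ("r2", (0.6, 0.7)),
    ("r3", (0.5, 0.6)),
    ("r4", (0.3, 0.9)),
]
pairs = build_preference_pairs(candidates)
# Only r2 dominates r3, so exactly one conflict-free pair is produced.
```

Because every emitted pair is ordered identically under all objectives, the resulting DPO gradients point in a consistent direction, which is the property the paper's conflict analysis motivates.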
Problem

Research questions and friction points this paper is trying to address.

Resolves preference conflicts in Multi-Objective Alignment
Enables self-generation of Pareto-optimal responses
Improves Pareto Front optimization in LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-improving DPO framework
Pareto-optimal response generation
Self-supervised preference alignment
Authors

Moxin Li, National University of Singapore
Yuantao Zhang, National University of Singapore
Wenjie Wang, National University of Singapore; University of Science and Technology of China
Wentao Shi, University of Science and Technology of China
Zhuo Liu, University of Science and Technology of China
Fuli Feng, University of Science and Technology of China
Tat-Seng Chua, National University of Singapore