Enhancing Value Alignment of LLMs with Multi-agent system and Combinatorial Fusion

📅 2026-03-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses a limitation of current value-alignment approaches for large language models, which often rely on a single evaluator or narrow reward signals and thus fail to capture ethical pluralism. To overcome this, the authors propose a multi-agent framework in which each agent embodies a distinct normative perspective. They introduce a Combinatorial Fusion Analysis (CFA) mechanism that integrates multi-agent fine-tuning with a dual aggregation strategy combining ranking and scoring, which mitigates the value conflicts and redundancies inherent in diverse ethical viewpoints. Experimental results demonstrate that the proposed method significantly outperforms single-agent baselines and existing aggregation techniques across standard metrics, thereby enhancing the alignment of large language models with multifaceted ethical dimensions.

📝 Abstract
Aligning large language models (LLMs) with human values is a central challenge for ensuring trustworthy and safe deployment. While existing methods such as Reinforcement Learning from Human Feedback (RLHF) and its variants have improved alignment, they often rely on a single evaluator or narrowly defined reward signals, limiting their ability to capture ethical pluralism. In this work, we propose the Value Alignment System using Combinatorial Fusion Analysis (VAS-CFA), a framework that operationalizes multi-agent fusion alignment. It instantiates multiple moral agents, each fine-tuned to represent a distinct normative perspective, and fuses their outputs using CFA with both rank- and score-based aggregation. This design leverages cognitive diversity between agents to mitigate conflicts and redundancies across the agent pool, producing responses that better reflect human values. Empirical evaluation demonstrates that VAS-CFA outperforms both single-agent baselines and prior aggregation approaches on standard metrics, showing that multi-agent fusion provides a robust and effective mechanism for advancing value alignment in LLMs.
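The rank- and score-based aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes only that each moral agent assigns a numeric score to each candidate response, and shows the two classic CFA combination rules (score combination on min-max-normalized scores, and rank combination on per-agent rankings):

```python
# Hedged sketch of CFA-style fusion across moral agents.
# Assumption (not from the paper): each agent returns one score per candidate.

def score_fusion(agent_scores):
    """Average min-max-normalized scores across agents (higher is better)."""
    normalized = []
    for scores in agent_scores:
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0  # avoid division by zero for constant scores
        normalized.append([(s - lo) / span for s in scores])
    n = len(agent_scores)
    return [sum(col) / n for col in zip(*normalized)]

def rank_fusion(agent_scores):
    """Average ranks across agents (rank 1 = best, so lower is better)."""
    all_ranks = []
    for scores in agent_scores:
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        ranks = [0] * len(scores)
        for r, i in enumerate(order, start=1):
            ranks[i] = r
        all_ranks.append(ranks)
    n = len(agent_scores)
    return [sum(col) / n for col in zip(*all_ranks)]

# Example: three hypothetical normative agents score four candidate responses.
agent_scores = [
    [0.9, 0.2, 0.6, 0.4],  # e.g. a deontological agent
    [0.3, 0.8, 0.7, 0.5],  # e.g. a consequentialist agent
    [0.6, 0.4, 0.9, 0.2],  # e.g. a virtue-ethics agent
]
fused_scores = score_fusion(agent_scores)
fused_ranks = rank_fusion(agent_scores)
best_by_score = max(range(4), key=lambda i: fused_scores[i])
best_by_rank = min(range(4), key=lambda i: fused_ranks[i])
```

In this toy example both combinations select the same candidate (index 2), but in general the two rules can disagree; CFA studies when rank combination outperforms score combination as a function of the diversity among the fused systems.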
Problem

Research questions and friction points this paper is trying to address.

Value Alignment
Large Language Models
Ethical Pluralism
Multi-agent System
Human Values
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent system
combinatorial fusion analysis
value alignment
large language models
ethical pluralism
Yuanhong Wu
Dept. of Computer and Information Science, Fordham University, New York, NY, USA
Djallel Bouneffouf
Unknown affiliation
Reinforcement learning, multi-armed bandits, context-aware recommender systems
D. Frank Hsu
Dept. of Computer and Information Science, Fordham University, New York, NY, USA