ValueFlow: Measuring the Propagation of Value Perturbations in Multi-Agent LLM Systems

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the poorly understood mechanisms by which value perturbations propagate and induce value drift in multi-agent large language model (LLM) systems. The authors propose ValueFlow, a framework that decomposes value drift into agent-level response behavior and system-level structural effects. They introduce two quantitative metrics, β-susceptibility and system susceptibility (SS), to characterize value drift at the level of individual agents and of the whole system, respectively. Drawing on the Schwartz Value Survey, they construct a 56-value evaluation dataset and apply an LLM-as-a-judge protocol to score agents' value orientations during simulated multi-agent interactions. Experiments across multiple model backbones, prompt personas, value dimensions, and network structures show that susceptibility to perturbation varies widely across values and is strongly shaped by structural topology.

📝 Abstract
Multi-agent large language model (LLM) systems increasingly consist of agents that observe and respond to one another's outputs. While value alignment is typically evaluated for isolated models, how value perturbations propagate through agent interactions remains poorly understood. We present ValueFlow, a perturbation-based evaluation framework for measuring and analyzing value drift in multi-agent systems. ValueFlow introduces a 56-value evaluation dataset derived from the Schwartz Value Survey and quantifies agents' value orientations during interaction using an LLM-as-a-judge protocol. Building on this measurement layer, ValueFlow decomposes value drift into agent-level response behavior and system-level structural effects, operationalized by two metrics: beta-susceptibility, which measures an agent's sensitivity to perturbed peer signals, and system susceptibility (SS), which captures how node-level perturbations affect final system outputs. Experiments across multiple model backbones, prompt personas, value dimensions, and network structures show that susceptibility varies widely across values and is strongly shaped by structural topology.
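The abstract does not give formulas for the two metrics, so the following is only an illustrative sketch of how an agent-level and a system-level sensitivity could be computed, not the authors' definitions. It assumes that beta-susceptibility can be approximated by the least-squares slope of an agent's judged value-score shift against the shift in the peer signal it observes, and that system susceptibility can be approximated by the shift in the final agent's score per unit perturbation injected at one node of a toy linear chain. The function names, the chain topology, and the linear response model are all hypothetical.

```python
from statistics import mean


def beta_susceptibility(peer_shifts, agent_shifts):
    """Assumed proxy: least-squares slope of the agent's score shift
    on the shift of the peer signal it observed."""
    mx, my = mean(peer_shifts), mean(agent_shifts)
    sxx = sum((x - mx) ** 2 for x in peer_shifts)
    if sxx == 0:
        raise ValueError("peer shifts must vary to estimate a slope")
    sxy = sum((x - mx) * (y - my) for x, y in zip(peer_shifts, agent_shifts))
    return sxy / sxx


def final_shift(betas, perturbation, node):
    """Toy linear chain (an assumption): agent i passes on beta_i times
    the shift it receives; the perturbed node's shift is forced."""
    shift = 0.0
    for i, beta in enumerate(betas):
        if i == node:
            shift = perturbation  # inject the perturbation at this node
        else:
            shift = beta * shift  # attenuate or amplify the upstream shift
    return shift


def system_susceptibility(betas, node, delta=1.0):
    """Assumed proxy: final-output shift per unit perturbation at `node`."""
    return final_shift(betas, delta, node) / delta


if __name__ == "__main__":
    # Hypothetical judged shifts for one agent across four perturbation levels.
    print(beta_susceptibility([0, 1, 2, 3], [0.0, 0.5, 1.0, 1.5]))
    # Perturbing earlier nodes in a damping chain affects the output less.
    for node in range(3):
        print(node, system_susceptibility([0.5, 0.5, 0.5], node))
```

In this toy model, a perturbation injected further upstream passes through more agents before reaching the output, which is one simple way that structure alone can change how strongly a node-level perturbation shows up in the final result.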
Problem

Research questions and friction points this paper is trying to address.

value propagation
multi-agent LLM systems
value drift
value alignment
agent interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

ValueFlow
value perturbation
multi-agent LLM systems
LLM-as-a-judge
system susceptibility
Jinnuo Liu
Center for Data Science, NYU Shanghai, New York University
Chuke Liu
Center for Data Science, NYU Shanghai, New York University
Hua Shen
Assistant Professor, NYU Shanghai / New York University
bidirectional human-AI alignment, human-AI interaction, AI/LLM interpretability and evaluation