ValueFlow: Measuring the Propagation of Value Perturbations in Multi-Agent LLM Systems

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the poorly understood mechanisms by which value perturbations propagate and induce value drift in multi-agent large language model (LLM) systems. The authors propose ValueFlow, a framework that decomposes value drift into agent-level response behavior and system-level structural effects. They introduce two metrics, β-susceptibility and system susceptibility (SS), to characterize value drift at the agent scale and the system scale, respectively. Drawing on the Schwartz Value Survey, they construct a 56-value evaluation dataset and score agents' value orientations with an LLM-as-a-judge protocol during simulated multi-agent interactions. Experiments across model backbones, prompt personas, value dimensions, and network topologies show that susceptibility to perturbation varies widely across values and that network structure strongly shapes how value perturbations propagate.

📝 Abstract

Multi-agent large language model (LLM) systems increasingly consist of agents that observe and respond to one another's outputs. While value alignment is typically evaluated for isolated models, how value perturbations propagate through agent interactions remains poorly understood. We present ValueFlow, a perturbation-based evaluation framework for measuring and analyzing value drift in multi-agent systems. ValueFlow introduces a 56-value evaluation dataset derived from the Schwartz Value Survey and quantifies agents' value orientations during interaction using an LLM-as-a-judge protocol. Building on this measurement layer, ValueFlow decomposes value drift into agent-level response behavior and system-level structural effects, operationalized by two metrics: beta-susceptibility, which measures an agent's sensitivity to perturbed peer signals, and system susceptibility (SS), which captures how node-level perturbations affect final system outputs. Experiments across multiple model backbones, prompt personas, value dimensions, and network structures show that susceptibility varies widely across values and is strongly shaped by structural topology.
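
The abstract names the two metrics but does not give their formulas, so the following is only an illustrative sketch of one plausible reading, not the authors' definitions. It assumes that judge scores are plain numbers, that beta-susceptibility can be approximated by the least-squares slope of an agent's score shift against the shift in its peers' signals, and that system susceptibility can be approximated by the mean absolute change in the final output score when each node is perturbed in turn. The function names and input shapes are hypothetical.

```python
from statistics import mean


def beta_susceptibility(peer_shifts, agent_shifts):
    """Least-squares slope of the agent's score shift on the peer shift.

    Assumed (hypothetical) definition: a slope near 0 means the agent
    ignores perturbed peers; a slope near 1 means it follows them fully.
    """
    mx, my = mean(peer_shifts), mean(agent_shifts)
    sxx = sum((x - mx) ** 2 for x in peer_shifts)
    if sxx == 0:
        raise ValueError("peer_shifts must contain at least two distinct values")
    sxy = sum((x - mx) * (y - my) for x, y in zip(peer_shifts, agent_shifts))
    return sxy / sxx


def system_susceptibility(baseline_output, perturbed_outputs):
    """Mean absolute change in the final output score over perturbed nodes.

    Assumed (hypothetical) definition: `perturbed_outputs` maps each node
    name to the final system score observed when only that node is perturbed.
    """
    return mean(abs(score - baseline_output) for score in perturbed_outputs.values())


if __name__ == "__main__":
    # Made-up judge scores, used only to exercise the two functions.
    print(beta_susceptibility([0.0, 1.0, 2.0], [0.0, 0.5, 1.0]))
    print(system_susceptibility(5.0, {"a": 6.0, "b": 4.5}))
```

Under these assumptions the two quantities separate cleanly: the first is computed per agent from its own responses, while the second is computed per system and depends on where in the network the perturbation is injected.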

Problem

Research questions and friction points this paper is trying to address.

value propagation

multi-agent LLM systems

value drift

value alignment

agent interaction

Innovation

Methods, ideas, or system contributions that make the work stand out.

ValueFlow

value perturbation

multi-agent LLM systems