🤖 AI Summary
This study systematically evaluates the reliability of large language model (LLM) agents at reaching consensus under Byzantine faults without incentive mechanisms. The authors construct a synchronous, fully connected communication framework to simulate Byzantine agreement over scalar values and investigate the impact of model scale, group size, and Byzantine agent proportion through multidimensional experiments. Results reveal that even in entirely benign environments, LLM agents struggle to maintain stable consensus. The presence of only a few Byzantine agents drastically reduces consensus success rates, primarily through loss of liveness rather than value corruption. These findings expose a fundamental fragility in current LLM agents' capacity for distributed coordination, highlighting significant limitations in their robustness when deployed in adversarial multi-agent systems.
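The synchronous, all-to-all round structure described above can be sketched as follows. This is a minimal illustration of the simulation shape only, not the paper's pipeline: the real agents exchange natural-language messages via an LLM, whereas here an honest agent's update is stood in for by a trimmed mean (a classic Byzantine-robust rule), and all function names and parameters are assumptions for illustration.

```python
import random

def run_consensus(n=5, f=1, max_rounds=50, tol=1e-6, seed=0):
    """Toy synchronous, fully connected scalar agreement loop.

    Honest agents update via a trimmed mean; Byzantine agents send
    arbitrary values each round. Illustrative stand-in only -- the
    paper's agents negotiate in natural language via an LLM.
    """
    rng = random.Random(seed)
    honest = [rng.uniform(0.0, 100.0) for _ in range(n - f)]
    for rnd in range(1, max_rounds + 1):
        new_values = []
        for _ in honest:  # each honest agent acts once per synchronous round
            # Receive every honest value plus f arbitrary Byzantine values.
            received = honest + [rng.uniform(0.0, 100.0) for _ in range(f)]
            received.sort()
            # Trim the f lowest and f highest values to bound Byzantine influence.
            trimmed = received[f:len(received) - f] if f else received
            new_values.append(sum(trimmed) / len(trimmed))
        honest = new_values
        if max(honest) - min(honest) < tol:
            return {"agreed": True, "rounds": rnd}
    # Liveness failure: the group timed out without converging --
    # the dominant failure mode the study reports.
    return {"agreed": False, "rounds": max_rounds}

print(run_consensus(f=0))  # benign run: all agents compute the same mean, so they agree in one round
```

Distinguishing a timeout return from disagreement on values mirrors the paper's observation that failures are dominated by loss of liveness rather than value corruption.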
📝 Abstract
Large language models are increasingly deployed as cooperating agents, yet their behavior in adversarial consensus settings has not been systematically studied. We evaluate LLM-based agents on a Byzantine consensus game over scalar values using a synchronous all-to-all simulation. We test consensus in a no-stake setting where agents have no preferences over the final value, so evaluation focuses on agreement rather than value optimality. Across hundreds of simulations spanning model sizes, group sizes, and Byzantine fractions, we find that valid agreement is not reliable even in benign settings and degrades as group size grows. Introducing a small number of Byzantine agents further reduces success. Failures are dominated by loss of liveness, such as timeouts and stalled convergence, rather than subtle value corruption. Overall, the results suggest that reliable agreement is not yet a dependable emergent capability of current LLM-agent groups even in no-stake settings, raising caution for deployments that rely on robust coordination.