More Agents Helps but Adversarial Robustness Gap Persists

📅 2025-11-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates whether multi-LLM agent collaboration improves both accuracy and adversarial robustness in mathematical question answering. Using the Agent Forest sampling-and-voting framework, we systematically evaluate ensembles of up to 25 agents under punctuation noise and realistic spelling errors (WikiTypo, R2ATA). Results show that agent collaboration substantially boosts accuracy on both clean and noisy inputs—with the clearest gains as ensemble size increases from 1 to 5—yet adversarial robustness consistently lags behind standard performance, with human-like spelling errors posing the primary bottleneck. Our key contribution is the first integration of multi-agent cooperation with fine-grained, semantics-preserving adversarial perturbations—particularly realistic spelling errors—in mathematical reasoning tasks. We quantitatively characterize the inherent accuracy–robustness trade-off, providing empirical foundations for designing trustworthy multi-agent systems.
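The punctuation-noise perturbation described in the summary can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the sampling of positions, the punctuation alphabet, and the interpretation of "intensity" as the fraction of words perturbed are all assumptions.

```python
import random

# Assumed punctuation alphabet for the perturbation.
PUNCT = list(".,;:!?")

def add_punctuation_noise(question: str, intensity: float, seed: int = 0) -> str:
    """Append a random punctuation mark to each word with probability
    `intensity` (e.g. 0.1, 0.3, 0.5 for the three intensities in the paper).
    Hypothetical sketch; the paper's exact noise model may differ."""
    rng = random.Random(seed)
    noisy = []
    for word in question.split():
        if rng.random() < intensity:
            word = word + rng.choice(PUNCT)
        noisy.append(word)
    return " ".join(noisy)
```

Because the perturbation only adds punctuation, the question's semantics are preserved while its surface form changes, which is what makes the accuracy drop attributable to robustness rather than meaning shift.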

📝 Abstract
When LLM agents work together, they appear more powerful than a single LLM in mathematical question answering. However, are they also more robust to adversarial inputs? We investigate this question using adversarially perturbed math questions. These perturbations include punctuation noise at three intensities (10, 30, and 50 percent), plus real-world and human-like typos (WikiTypo, R2ATA). Using a unified sampling-and-voting framework (Agent Forest), we evaluate six open-source models (Qwen3-4B/14B, Llama3.1-8B, Mistral-7B, Gemma3-4B/12B) across four benchmarks (GSM8K, MATH, MMLU-Math, MultiArith), with the number of agents n ranging from 1 to 25 (1, 2, 5, 10, 15, 20, 25). Our findings show that (1) noise type matters: the harm from punctuation noise scales with its severity, and human-like typos remain the dominant bottleneck, yielding the largest gaps to clean accuracy and the highest ASR even with a large number of agents; and (2) collaboration reliably improves accuracy as the number of agents n increases, with the largest gains from one to five agents and diminishing returns beyond 10 agents. However, the adversarial robustness gap persists regardless of the agent count.
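The sampling-and-voting scheme underlying Agent Forest can be sketched in a few lines: sample n candidate answers and return the majority vote. The `answer_fn` callable is a hypothetical stand-in for one stochastic LLM call; in the actual framework each sample would be an independent model query.

```python
from collections import Counter

def sample_and_vote(answer_fn, question: str, n: int) -> str:
    """Minimal sampling-and-voting sketch (assumed interface, not the
    Agent Forest API): draw n answers and return the most frequent one."""
    answers = [answer_fn(question, i) for i in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

For example, if three of five sampled agents answer "4" and two answer "5", the ensemble returns "4"; this is why accuracy gains are largest when going from 1 to 5 agents and taper off once the majority is already stable.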
Problem

Research questions and friction points this paper is trying to address.

Investigating multi-agent LLM robustness against adversarial math perturbations
Analyzing collaboration benefits versus persistent adversarial vulnerability gaps
Evaluating noise type impact and agent scaling effects on robustness
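The robustness gap in the bullets above is typically quantified via the attack success rate (ASR). A common definition, sketched below under the assumption that the paper uses it (its exact formula may differ), is the fraction of questions answered correctly on clean input that flip to incorrect under perturbation.

```python
def attack_success_rate(clean_correct: list, noisy_correct: list) -> float:
    """ASR: among questions the system gets right on clean input, the
    fraction it gets wrong after perturbation. Common definition; the
    paper's exact metric is an assumption here."""
    flipped = sum(1 for c, p in zip(clean_correct, noisy_correct) if c and not p)
    total = sum(clean_correct)
    return flipped / total if total else 0.0
```

Under this metric, a persistent gap means ASR stays well above zero even as the agent count grows, matching the paper's finding for human-like typos.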
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agent Forest framework enables multi-agent collaboration
Evaluates robustness against punctuation noise and human typos
Shows persistent adversarial gap despite increasing agent count
Khashayar Alavi
Bonn-Aachen International Center for Information Technology, University of Bonn, Germany
Zhastay Yeltay
Bonn-Aachen International Center for Information Technology, University of Bonn, Germany
Lucie Flek
University of Bonn, Lamarr Institute of Machine Learning and Artificial Intelligence
Natural Language Processing, Machine Learning, Physics, Computational Social Sciences
Akbar Karimi
University of Bonn, Lamarr Institute
Natural Language Processing, Adversarial Learning, Robustness, AI Safety