An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring

📅 2025-05-30
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Multi-agent large language model (LLM) systems are vulnerable to adversarial or low-performing agents, leading to unreliable outputs. To address this, we propose the first adversary-resistant framework for such systems, modeling collaborative question answering as an iterative game. Our method introduces a history-aware mechanism that adaptively learns agent trustworthiness and performs trust-weighted output aggregation. It integrates four key components: trustworthiness modeling, game-theoretic collaboration, robust aggregation, and adversarial training-based evaluation. Extensive experiments across diverse tasks and settings demonstrate that our framework significantly mitigates malicious interference, improving both accuracy and output stability. Notably, it maintains high performance even under extreme conditions where adversarial agents constitute over 50% of the system, marking the first empirical validation of robustness in majority-adversary regimes.
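The trust-weighted aggregation described above can be sketched as a credibility-weighted vote over agent answers. This is an illustrative sketch, not the paper's exact formulation: the function name, the neutral prior of 0.5 for unseen agents, and the weighted-vote rule are all assumptions.

```python
from collections import defaultdict

def aggregate(answers, credibility):
    """Credibility-weighted vote over agent answers.

    answers: dict mapping agent id -> that agent's proposed answer
    credibility: dict mapping agent id -> learned credibility score in [0, 1]
    Returns the answer with the highest total credibility mass.
    """
    votes = defaultdict(float)
    for agent, answer in answers.items():
        # Unseen agents get a neutral prior weight (an assumption of this sketch).
        votes[answer] += credibility.get(agent, 0.5)
    return max(votes, key=votes.get)
```

Note how this survives an adversary majority: if two low-credibility agents agree on a wrong answer but one high-credibility agent answers correctly, the weighted vote can still favor the trusted agent, e.g. `aggregate({"a": "X", "b": "Y", "c": "Y"}, {"a": 0.9, "b": 0.2, "c": 0.3})` returns `"X"`.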

๐Ÿ“ Abstract
While multi-agent LLM systems show strong capabilities in various domains, they are highly vulnerable to adversarial and low-performing agents. To resolve this issue, in this paper, we introduce a general and adversary-resistant multi-agent LLM framework based on credibility scoring. We model the collaborative query-answering process as an iterative game, where the agents communicate and contribute to a final system output. Our system associates a credibility score with each agent, which is used when aggregating the team outputs. The credibility scores are learned gradually based on the past contributions of each agent in query answering. Our experiments across multiple tasks and settings demonstrate our system's effectiveness in mitigating adversarial influence and enhancing the resilience of multi-agent cooperation, even in adversary-majority settings.
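The abstract states that credibility scores are learned gradually from each agent's past contributions. One simple realization, offered purely as an assumed sketch (the exponential-moving-average rule, the learning rate `lr`, and the agreement-based reward are not from the paper), updates each agent's score toward 1 when it agreed with the aggregated system output and toward 0 when it did not:

```python
def update_credibility(credibility, answers, final_answer, lr=0.1):
    """History-aware credibility update (illustrative EMA rule).

    credibility: dict mapping agent id -> current score in [0, 1]
    answers: dict mapping agent id -> that agent's answer this round
    final_answer: the aggregated system output for this round
    lr: learning rate controlling how fast history is overwritten
    """
    updated = dict(credibility)
    for agent, answer in answers.items():
        # Reward agreement with the system output; penalize disagreement.
        reward = 1.0 if answer == final_answer else 0.0
        prior = credibility.get(agent, 0.5)  # neutral prior for new agents
        updated[agent] = (1 - lr) * prior + lr * reward
    return updated
```

Run over many rounds, persistently adversarial agents drift toward low scores and lose influence in the weighted aggregation, which matches the resilience behavior the abstract describes.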
Problem

Research questions and friction points this paper is trying to address.

Multi-agent LLM systems vulnerable to adversarial agents
Need adversary-resistant framework via credibility scoring
Enhancing resilience in adversary-majority settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversary-resistant multi-agent LLM framework
Credibility scoring for agent aggregation
Iterative game-based collaborative query-answering
Sana Ebrahimi
University of Illinois Chicago
Mohsen Dehghankar
University of Illinois Chicago
Abolfazl Asudeh
Associate Professor of Computer Science, University of Illinois Chicago
Algorithms for AI and Data · Computational Geometry · Responsible AI