Is Your LLM-Based Multi-Agent a Reliable Real-World Planner? Exploring Fraud Detection in Travel Planning

📅 2025-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Large language model (LLM)-based multi-agent planning systems face reliability risks in real-world deployment because they rely on public platforms that contain deceptive information, such as fake reviews and misleading descriptions. Method: This paper introduces WandaPlan, the first evaluation framework designed for realistic travel planning scenarios, featuring a fraud-simulation environment that systematically assesses multi-agent robustness under three fraud categories: misinformation, collusive group behavior, and multi-round adversarial escalation. It further proposes a plug-and-play anti-fraud agent integrating dynamic risk perception and an adversarial evaluation protocol. Contribution/Results: Experiments reveal that mainstream planning systems suffer from high misclassification rates because they neglect data authenticity. Validated across multiple open-source systems, WandaPlan significantly improves fraud detection: the anti-fraud agent boosts identification accuracy by 42.6%, demonstrating its effectiveness in enhancing planning reliability under deceptive conditions.

📝 Abstract

Large Language Model-based multi-agent planning builds on advanced frameworks to enable autonomous and collaborative task execution. Some systems rely on platforms such as review sites and social media, which are prone to fraudulent information such as fake reviews or misleading descriptions. This reliance poses risks, potentially causing financial losses and harming user experiences. To evaluate the risk of planning systems in real-world applications, we introduce WandaPlan, an evaluation environment that mirrors real-world data and is injected with deceptive content. We assess system performance across three fraud cases: Misinformation Fraud, Team-Coordinated Multi-Person Fraud, and Level-Escalating Multi-Round Fraud. We reveal significant weaknesses in existing frameworks that prioritize task efficiency over data authenticity. We also validate WandaPlan's generalizability, showing that it can assess the risks of real-world open-source planning frameworks. To mitigate the risk of fraud, we propose integrating an anti-fraud agent, providing a solution for reliable planning.

Problem

Research questions and friction points this paper is trying to address.

Evaluating fraud risks in LLM-based multi-agent travel planning systems

Assessing system vulnerabilities to misinformation and deceptive content

Proposing anti-fraud solutions for reliable real-world task execution

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces WandaPlan for fraud detection evaluation

Assesses three fraud cases in planning systems

Proposes anti-fraud agent for reliable planning
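
To make the idea of a plug-and-play anti-fraud step concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Listing` fields, the `risk_score` heuristic, the thresholds, and the sample data are all hypothetical and chosen only for illustration. It shows an efficiency-first planner picking an injected fraudulent listing, and the same planner picking a genuine one once a filter runs first.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Listing:
    name: str
    price: float           # nightly price
    rating: float          # average star rating, 0 to 5
    review_count: int
    is_fraud: bool = False  # ground-truth label, never read by planner or filter


def risk_score(item: Listing, median_price: float) -> float:
    """Hypothetical heuristic: a near-perfect rating backed by very few
    reviews, or a price far below the median, each add to the risk."""
    score = 0.0
    if item.rating >= 4.8 and item.review_count < 10:
        score += 0.5
    if item.price < 0.5 * median_price:
        score += 0.5
    return score


def anti_fraud_filter(items: list[Listing], threshold: float = 0.5) -> list[Listing]:
    """Drop listings whose risk score reaches the (assumed) threshold."""
    median_price = median(i.price for i in items)
    return [i for i in items if risk_score(i, median_price) < threshold]


def naive_planner(items: list[Listing]) -> Listing:
    """Efficiency-first choice: highest rating, ties broken by lowest price."""
    return max(items, key=lambda i: (i.rating, -i.price))


# Illustrative data only: three plausible listings plus one injected fake.
LISTINGS = [
    Listing("Harbor Hotel", 120.0, 4.4, 850),
    Listing("City Inn", 95.0, 4.1, 430),
    Listing("Grand Stay", 150.0, 4.6, 1200),
    Listing("Dream Palace", 40.0, 5.0, 3, is_fraud=True),
]

if __name__ == "__main__":
    print("without filter:", naive_planner(LISTINGS).name)
    print("with filter:   ", naive_planner(anti_fraud_filter(LISTINGS)).name)
```

Because the filter sits in front of the planner and does not modify it, the same step could in principle be placed in front of other planning pipelines. A real anti-fraud agent would need far richer signals than these two rules.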

🔎 Similar Papers

TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation