AutoRedTrader: Autonomous Red Teaming of Trading Agents through Synthetic Misinformation Injection

📅 2026-05-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of LLM-based financial agents to subtle textual misinformation, which is hard to detect yet can significantly alter agent reasoning and decisions. The authors propose AutoRedTrader, an autonomous red-teaming framework for financial agents that generates finance-specific misinformation through behavioral bias manipulation, minor textual perturbations, and rewriting strategies. Agent feedback is used in a closed loop to strengthen attacks over time. The framework is evaluated in a POMDP-based financial agent simulation environment, with a time-series-informed grounding setting used to test whether historical market evidence stabilizes decisions under misleading text. On Bitcoin transaction data, AutoRedTrader achieves a misinformation exposure rate of 69.00% and an attack success rate of 26.67%, outperforming general-purpose misinformation and red-teaming baselines.
📝 Abstract
LLM-based financial agents increasingly rely on both numerical market data and textual signals for sequential trading and stock prediction. However, financial misinformation often appears as subtle textual perturbations rather than explicit falsehoods, making it difficult to detect while still capable of significantly altering agent reasoning and decisions. To study this risk, we propose AutoRedTrader, an autonomous red-teaming framework that generates finance-specific misinformation through behavioral bias manipulation, minor textual perturbations, and rewriting strategies, with agent feedback used to strengthen attacks over time. We evaluate AutoRedTrader in a POMDP-based financial agent simulation environment, and further examine a time-series-informed grounding setting for robustness analysis. The framework enables systematic evaluation of how subtle misinformation affects financial agents and whether historical market evidence can stabilize decisions under misleading textual signals. We evaluate the framework on Bitcoin transaction data. The results show that AutoRedTrader achieves the strongest attack performance with 69.00% misinformation exposure rate and 26.67% attack success rate, outperforming general-purpose misinformation and red-teaming baselines. Ablation studies further show that all modules contribute to generating retrievable and decision-effective financial misinformation.
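The abstract reports two headline metrics without defining them here. As a rough illustration only, the sketch below assumes that "exposure" means the agent retrieved the injected text in an episode, and that an attack "succeeds" when the text was retrieved and the agent's action differs from its action on clean inputs. These definitions, the `Episode` record, and its field names are assumptions made for illustration, not the paper's formulation.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    # All fields are hypothetical; the paper may log different signals.
    retrieved_injected: bool  # did the agent retrieve the injected text?
    clean_action: str         # action taken on unmodified inputs
    attacked_action: str      # action taken with the injected text present


def exposure_rate(episodes: list[Episode]) -> float:
    """Fraction of episodes in which the injected text was retrieved."""
    if not episodes:
        return 0.0
    return sum(e.retrieved_injected for e in episodes) / len(episodes)


def attack_success_rate(episodes: list[Episode]) -> float:
    """Fraction of episodes where the text was retrieved AND the action changed."""
    if not episodes:
        return 0.0
    hits = sum(
        e.retrieved_injected and e.clean_action != e.attacked_action
        for e in episodes
    )
    return hits / len(episodes)


# Toy data, not taken from the paper.
episodes = [
    Episode(True, "hold", "sell"),   # exposed, decision flipped
    Episode(True, "buy", "buy"),     # exposed, decision unchanged
    Episode(False, "hold", "buy"),   # not exposed, so not counted as a success
    Episode(False, "sell", "sell"),  # not exposed, decision unchanged
]
exposure = exposure_rate(episodes)
success = attack_success_rate(episodes)
```

Under these assumed definitions the success rate can never exceed the exposure rate, which matches the ordering of the reported figures (26.67% versus 69.00%).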
Problem

Research questions and friction points this paper is trying to address.

financial misinformation
trading agents
textual perturbations
autonomous red teaming
LLM-based agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

autonomous red teaming
financial misinformation
LLM-based trading agents
textual perturbation
behavioral bias manipulation