FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

📅 2026-01-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a gap in existing safety evaluations for financial AI agents: they rarely account for state-altering risks in real execution environments and so understate the hazards of highly regulated settings. To bridge this gap, it proposes FinVault, the first execution-grounded security benchmark for financial agents. FinVault comprises 31 sandbox scenarios derived from regulatory cases, each with a state-writable database and explicit compliance constraints, so that agent behavior can be assessed within realistic workflows. It covers 107 real-world vulnerabilities and 963 test cases spanning prompt injection, jailbreaking, and financially adapted attacks, and it includes benign inputs to measure false-positive rates. Experiments show that existing defenses remain ineffective in financial settings: average attack success rates (ASR) reach up to 50.0% on state-of-the-art models and remain at 6.7% even for the most robust system, which points to the limited transferability of current safety designs to the financial domain.

📝 Abstract

Financial agents powered by large language models (LLMs) are increasingly deployed for investment analysis, risk assessment, and automated decision-making, where their abilities to plan, invoke tools, and manipulate mutable state introduce new security risks in high-stakes and highly regulated financial environments. However, existing safety evaluations largely focus on language-model-level content compliance or abstract agent settings, failing to capture execution-grounded risks arising from real operational workflows and state-changing actions. To bridge this gap, we propose FinVault, the first execution-grounded security benchmark for financial agents, comprising 31 regulatory case-driven sandbox scenarios with state-writable databases and explicit compliance constraints, together with 107 real-world vulnerabilities and 963 test cases that systematically cover prompt injection, jailbreaking, and financially adapted attacks, as well as benign inputs for false-positive evaluation. Experimental results reveal that existing defense mechanisms remain ineffective in realistic financial agent settings, with average attack success rates (ASR) still reaching up to 50.0% on state-of-the-art models and remaining non-negligible even for the most robust systems (ASR 6.7%), highlighting the limited transferability of current safety designs and the need for stronger financial-specific defenses. Our code can be found at https://github.com/aifinlab/FinVault.
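The abstract reports two kinds of measurement: attack success rate (ASR) on adversarial cases and false positives on benign inputs. The sketch below shows one plausible way such rates could be computed from per-case outcomes. It is an illustration only, not FinVault's scoring code: the `Outcome` record, its field names, and the exact definitions of "success" and "false positive" are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Hypothetical per-test-case record (not FinVault's actual schema)."""
    is_attack: bool  # True for an adversarial case, False for a benign input
    violated: bool   # attack case: the agent carried out the prohibited action
    blocked: bool    # benign case: the legitimate request was refused


def attack_success_rate(outcomes):
    """Fraction of attack cases in which the prohibited action occurred."""
    attacks = [o for o in outcomes if o.is_attack]
    if not attacks:
        return 0.0
    return sum(o.violated for o in attacks) / len(attacks)


def false_positive_rate(outcomes):
    """Fraction of benign cases that were wrongly refused."""
    benign = [o for o in outcomes if not o.is_attack]
    if not benign:
        return 0.0
    return sum(o.blocked for o in benign) / len(benign)


# Toy data, invented for illustration: 4 attack cases and 5 benign cases.
sample = (
    [Outcome(True, True, False)] * 2
    + [Outcome(True, False, False)] * 2
    + [Outcome(False, False, True)]
    + [Outcome(False, False, False)] * 4
)

print(f"ASR: {attack_success_rate(sample):.1%}")  # ASR: 50.0%
print(f"FPR: {false_positive_rate(sample):.1%}")  # FPR: 20.0%
```

Reporting both rates together matters because a defense that refuses everything would drive ASR to zero while also blocking every legitimate request.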

Problem

Research questions and friction points this paper is trying to address.

financial agents

execution-grounded safety

security benchmark

state-changing actions

regulatory compliance

Innovation

Methods, ideas, or system contributions that make the work stand out.

execution-grounded benchmark

financial agent safety

state-writable sandbox