FinGuard: Detecting Financial Regulatory Non-Compliance in LLM Interactions

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a gap in existing large language models: they cannot assess compliance with specific financial regulations, which creates significant regulatory risk in financial services. We propose FinGuard, an approach that identifies compliance risks without predefined violation categories by automatically generating annotated data directly from financial regulatory documents. We further introduce FinGuard-Bench, to our knowledge the first benchmark and risk taxonomy tailored to financial regulatory compliance. Built on Qwen3-8B and trained with supervised fine-tuning and self-play reinforcement learning, FinGuard adapts to new regulatory rules using only institutional policy documents. On FinGuard-Bench, it substantially outperforms strong baselines such as GPT-5.1 and Qwen3.5-397B-A17B while preserving general safety capabilities.

📝 Abstract

As large language models (LLMs) are increasingly deployed in financial services, a single non-compliant interaction can expose institutions to regulatory penalties and cause direct consumer harm. Existing guard models are built around general harm taxonomies and overlook violations grounded in specific financial regulations. We address this gap with a regulation-driven pipeline that operates directly on regulatory documents, inducing a financial compliance risk taxonomy and synthesizing grounded training data without any predefined violation categories. Instantiating the pipeline on Chinese financial regulations, we release FinGuard-Bench, to our knowledge the first benchmark for financial regulatory compliance detection, with expert-annotated labels at both the query and response levels. We further train FinGuard, a financial compliance detection model built on Qwen3-8B and trained on the regulation-grounded data via supervised fine-tuning and self-play reinforcement learning. On FinGuard-Bench, FinGuard substantially outperforms all baselines, including dedicated guard models and much larger general-purpose LLMs such as Qwen3.5-397B-A17B and GPT-5.1. FinGuard also preserves general safety capabilities and adapts to unseen institution-specific policies using policy documents alone. We will publicly release the code, prompts, and resources used in this work on GitHub.
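To make the two label levels concrete, the following is a minimal sketch of how a guard could be scored separately on queries and responses. It is not the paper's evaluation code: the record schema (`ComplianceExample`), the guard interface, the toy keyword guard, the example texts, and the choice of F1 as the metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass(frozen=True)
class ComplianceExample:
    """One benchmark item with labels at both levels (hypothetical schema)."""
    query: str
    response: str
    query_violation: bool     # the user query itself seeks non-compliant help
    response_violation: bool  # the model response breaches a regulation

# Assumed guard interface: (query, response) -> (query flag, response flag).
Guard = Callable[[str, str], tuple[bool, bool]]

def f1(gold: list[bool], pred: list[bool]) -> float:
    """Binary F1 with 'violation' as the positive class (assumed metric)."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def evaluate(guard: Guard, examples: Iterable[ComplianceExample]) -> dict[str, float]:
    """Score a guard on query-level and response-level labels independently."""
    items = list(examples)
    preds = [guard(ex.query, ex.response) for ex in items]
    return {
        "query_f1": f1([ex.query_violation for ex in items], [p[0] for p in preds]),
        "response_f1": f1([ex.response_violation for ex in items], [p[1] for p in preds]),
    }

# Toy data and a toy keyword guard, both invented for this sketch.
examples = [
    ComplianceExample("Can you guarantee 10% returns?", "Yes, returns are guaranteed.", True, True),
    ComplianceExample("What is a bond?", "A bond is a debt instrument.", False, False),
    ComplianceExample("What is a fund?", "This fund has guaranteed returns.", False, True),
]

def keyword_guard(query: str, response: str) -> tuple[bool, bool]:
    return ("guarantee" in query.lower(), "guarantee" in response.lower())

scores = evaluate(keyword_guard, examples)
```

Scoring the two levels separately keeps a guard that only flags risky queries from looking strong on responses, and vice versa.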

Problem

Research questions and friction points this paper is trying to address.

financial regulatory compliance

large language models

non-compliance detection

guard models

regulatory violations

Innovation

Methods, ideas, or system contributions that make the work stand out.

financial regulatory compliance

LLM guardrails

regulation-grounded data synthesis