SAFE-QAQ: End-to-End Slow-Thinking Audio-Text Fraud Detection via Reinforcement Learning

📅 2026-01-04

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes the first end-to-end, ASR-free audio-level "slow-thinking" fraud detection framework that addresses the limitations of existing methods, which rely heavily on speech transcriptions and are thus vulnerable to recognition errors while neglecting critical acoustic cues such as prosody and environmental context. By leveraging a reinforcement learning–driven hierarchical reasoning mechanism, the model dynamically integrates fine-grained acoustic features to enable real-time risk assessment and early warning during phone calls. Evaluated on the TeleAntiFraud-Bench benchmark, the approach significantly outperforms current state-of-the-art models in accuracy, inference efficiency, and real-time performance. It has been deployed in production, analyzing over 70,000 calls daily, effectively reducing both manual review costs and financial losses.

📝 Abstract

Existing fraud detection methods predominantly rely on transcribed text, so they suffer from ASR errors and miss crucial acoustic cues such as vocal tone and environmental context. This limits their effectiveness against complex deceptive strategies. To address these challenges, we propose SAFE-QAQ, the first end-to-end framework for audio-based slow-thinking fraud detection. First, SAFE-QAQ eliminates the impact of transcription errors on detection performance. Second, we propose rule-based slow-thinking reward mechanisms that guide the system, through hierarchical reasoning, to identify fraud-indicative patterns by accurately capturing fine-grained audio details. Third, the framework introduces dynamic risk assessment during live calls, enabling early detection and prevention of fraud. Experiments on TeleAntiFraud-Bench demonstrate that SAFE-QAQ achieves dramatic improvements over existing methods in multiple key dimensions, including accuracy, inference efficiency, and real-time processing capabilities. Currently deployed and analyzing over 70,000 calls daily, SAFE-QAQ automates complex fraud detection, reducing human workload and financial losses. Code: https://anonymous.4open.science/r/SAFE-QAQ.

Problem

Research questions and friction points this paper is trying to address.

fraud detection

audio-text

ASR errors

acoustic cues

deceptive strategies

Innovation

Methods, ideas, or system contributions that make the work stand out.

slow-thinking

audio-text fraud detection

reinforcement learning