RISK: A Framework for GUI Agents in E-commerce Risk Management

📅 2025-09-26

🤖 AI Summary

Problem: Existing GUI agents for e-commerce risk control are mostly limited to single-step actions and handle dynamic web content poorly, so they cannot carry out the multi-step, stateful interactions needed to aggregate deeply embedded webpage data. Method: The authors propose RISK, a framework with three parts: (i) the RISK-Data dataset, collected through a high-fidelity browser framework, and the RISK-Bench evaluation benchmark; (ii) a multi-level reward design covering output-format compliance, per-step accuracy, process reweighting that emphasizes later steps in a trajectory, and task-difficulty reweighting; and (iii) RISK-R1, an R1-style reinforcement fine-tuning procedure that uses these rewards. Results: RISK-R1 improves on existing baselines by 6.8% on offline single-step tasks and 8.8% on offline multi-step tasks, and reaches a top task success rate of 70.5% in online evaluation. The main contributions are a systematic framework for e-commerce risk-control GUI automation (dataset, benchmark, and training recipe), an approach to multi-step interaction with dynamic web content, and a hierarchical reward design.

📝 Abstract

E-commerce risk management requires aggregating diverse, deeply embedded web data through multi-step, stateful interactions, which traditional scraping methods and most existing Graphical User Interface (GUI) agents cannot handle. These agents are typically limited to single-step tasks and lack the ability to manage dynamic, interactive content critical for effective risk assessment. To address this challenge, we introduce RISK, a novel framework designed to build and deploy GUI agents for this domain. RISK integrates three components: (1) RISK-Data, a dataset of 8,492 single-step and 2,386 multi-step interaction trajectories, collected through a high-fidelity browser framework and a meticulous data curation process; (2) RISK-Bench, a benchmark with 802 single-step and 320 multi-step trajectories across three difficulty levels for standardized evaluation; and (3) RISK-R1, an R1-style reinforcement fine-tuning framework considering four aspects: (i) Output Format: Updated format reward to enhance output syntactic correctness and task comprehension, (ii) Single-step Level: Stepwise accuracy reward to provide granular feedback during early training stages, (iii) Multi-step Level: Process reweight to emphasize critical later steps in interaction sequences, and (iv) Task Level: Level reweight to focus on tasks of varying difficulty. Experiments show that RISK-R1 outperforms existing baselines, achieving a 6.8% improvement on offline single-step tasks and an 8.8% improvement on offline multi-step tasks. Moreover, it attains a top task success rate of 70.5% in online evaluation. RISK provides a scalable, domain-specific solution for automating complex web interactions, advancing the state of the art in e-commerce risk management.
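
The four reward aspects listed in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract gives no formulas, so the JSON output format, the linear step weights, the `LEVEL_WEIGHTS` values, the `format_coef` mixing coefficient, and every function name below are assumptions chosen only to show how format, stepwise, process-level, and task-level signals might combine into one trajectory reward.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical difficulty multipliers; RISK-Bench has three levels,
# but the actual weights are not given in the abstract.
LEVEL_WEIGHTS = {"easy": 1.0, "medium": 1.25, "hard": 1.5}


@dataclass
class Step:
    predicted: str        # raw model output for this step (assumed to be JSON)
    expected_action: str  # reference action label for this step


def _parse(output: str):
    """Return the parsed JSON object, or None if the output is malformed."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def format_reward(output: str) -> float:
    """(i) Output format: 1.0 if the output is a JSON object with an 'action' key."""
    parsed = _parse(output)
    return 1.0 if parsed is not None and "action" in parsed else 0.0


def step_accuracy_reward(output: str, expected_action: str) -> float:
    """(ii) Single-step level: 1.0 if the predicted action matches the reference."""
    parsed = _parse(output)
    if parsed is None:
        return 0.0
    return 1.0 if parsed.get("action") == expected_action else 0.0


def process_weights(n_steps: int) -> List[float]:
    """(iii) Multi-step level: linearly increasing weights that sum to 1,
    so later steps contribute more (an assumed weighting scheme)."""
    raw = [i + 1 for i in range(n_steps)]
    total = sum(raw)
    return [r / total for r in raw]


def trajectory_reward(steps: List[Step], level: str, format_coef: float = 0.2) -> float:
    """Combine the per-step rewards, then apply (iv) the task-level weight."""
    if not steps:
        return 0.0
    weights = process_weights(len(steps))
    per_step = [
        format_coef * format_reward(s.predicted)
        + (1.0 - format_coef) * step_accuracy_reward(s.predicted, s.expected_action)
        for s in steps
    ]
    return LEVEL_WEIGHTS[level] * sum(w * r for w, r in zip(weights, per_step))
```

Under these assumptions, a trajectory whose only mistake is in the final step scores lower than one whose only mistake is in the first step, which is the behavior the "process reweight" aspect describes. The same trajectory also earns a larger reward when labeled `"hard"` than when labeled `"easy"`.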

Problem

Research questions and friction points this paper is trying to address.

Handling multi-step web interactions for e-commerce risk management

Managing dynamic content beyond traditional scraping limitations

Automating complex GUI tasks for effective risk assessment

Innovation

Methods, ideas, or system contributions that make the work stand out.

RISK framework integrates dataset, benchmark, and reinforcement fine-tuning

RISK-R1 uses multi-level rewards for enhanced interaction performance

Framework enables scalable automation of complex e-commerce web interactions
