🤖 AI Summary
Video ad moderation faces three core challenges: imprecise fine-grained violation localization, weak interpretability, and limited generalization. To address these, we propose an active reinforcement reasoning framework for fine-grained violation detection. Our approach integrates active reinforcement learning, a hierarchical reward mechanism, reasoning distillation, and a curriculum-driven multi-stage training strategy—collectively enhancing the model’s precision in localizing violations within complex ad semantics and its capacity for interpretable, stepwise reasoning. Extensive evaluations on multiple public and proprietary benchmarks demonstrate that our method consistently outperforms general-purpose large language models (LLMs) and state-of-the-art specialized models (e.g., RAVEN). Both offline ablation studies and online A/B tests confirm its superior fine-grained comprehension, robust generalization across diverse ad domains, and tangible business impact in production deployment.
📝 Abstract
Advertising (Ad) is a cornerstone of the digital economy, yet the moderation of video advertisements remains a significant challenge due to their complexity and the need for precise violation localization. While recent advancements, such as the RAVEN model, have improved coarse-grained violation detection, critical gaps persist in fine-grained understanding, explainability, and generalization. To address these limitations, we propose RAVEN++, a novel framework that introduces three key innovations: 1) Active Reinforcement Learning (RL), which dynamically adapts training to samples of varying difficulty; 2) Fine-Grained Violation Understanding, achieved through hierarchical reward functions and reasoning distillation; and 3) Progressive Multi-Stage Training, which systematically combines knowledge injection, curriculum-based passive RL, and active RL. Extensive experiments on both public and proprietary datasets, covering both offline evaluation and online deployed A/B testing, demonstrate that RAVEN++ outperforms general-purpose LLMs and specialized models like RAVEN in fine-grained violation understanding, reasoning capability, and generalization.
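To make the two central mechanisms more concrete, the sketch below illustrates the general idea of difficulty-adaptive active RL paired with a hierarchical reward. This is a minimal toy illustration under our own assumptions, not RAVEN++'s actual implementation: the reward weights, the sample schema (`category` for the coarse violation class, `label` for the fine-grained one), and the function names are all hypothetical.

```python
# Toy sketch (hypothetical, not the paper's code): an active-RL loop selects
# the samples on which the current policy earns the LOWEST hierarchical
# reward, so training focuses on the hardest cases.

def hierarchical_reward(pred, gold):
    """Toy hierarchical reward: matching the coarse violation category
    earns 0.5; matching the exact fine-grained label earns another 0.5."""
    coarse = 0.5 if pred["category"] == gold["category"] else 0.0
    fine = 0.5 if pred["label"] == gold["label"] else 0.0
    return coarse + fine

def select_active_batch(samples, predict, batch_size):
    """Score each sample with the current policy's reward and return the
    batch_size lowest-reward (hardest) samples for the next update."""
    scored = [(hierarchical_reward(predict(s), s["gold"]), s) for s in samples]
    scored.sort(key=lambda t: t[0])  # ascending reward: hardest first
    return [s for _, s in scored[:batch_size]]

# Example: a frozen toy policy that always predicts one violation type.
policy = lambda s: {"category": "violence", "label": "weapon"}
pool = [
    {"id": 1, "gold": {"category": "violence", "label": "weapon"}},  # reward 1.0
    {"id": 2, "gold": {"category": "violence", "label": "blood"}},   # reward 0.5
    {"id": 3, "gold": {"category": "adult", "label": "nudity"}},     # reward 0.0
]
hardest = select_active_batch(pool, policy, batch_size=2)
# hardest contains ids 3 and 2: the samples the policy handles worst.
```

The design point mirrored here is the contrast with passive (uniform-sampling) RL: by scoring the pool with the reward itself, the sampler adapts automatically as the policy improves, which is the intuition behind the curriculum-then-active training progression the abstract describes.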