WildActor: Unconstrained Identity-Preserving Video Generation

📅 2026-02-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing human video generation methods struggle to maintain full-body identity consistency across multiple viewpoints, dynamic camera motion, and large body movements. They tend either to be face-centric, neglecting body-level consistency, or to produce rigid, pose-locked subjects. To address this, the paper proposes WildActor, a framework for any-view conditioned human video generation that combines an Asymmetric Identity-Preserving Attention mechanism with a Viewpoint-Adaptive Monte Carlo Sampling strategy, which iteratively re-weights reference conditions by marginal utility for balanced manifold coverage. The paper also introduces Actor-18M, a large-scale human video dataset (1.6M videos with 18M corresponding human images) that covers unconstrained viewpoints and environments as well as canonical three-view representations, and Actor-Bench, a benchmark on which WildActor is reported to surpass existing methods under diverse shot compositions, large viewpoint transitions, and substantial motion.

📝 Abstract

Production-ready human video generation requires digital actors to maintain strictly consistent full-body identities across dynamic shots, viewpoints and motions, a setting that remains challenging for existing methods. Prior methods often suffer from face-centric behavior that neglects body-level consistency, or produce copy-paste artifacts where subjects appear rigid due to pose locking. We present Actor-18M, a large-scale human video dataset designed to capture identity consistency under unconstrained viewpoints and environments. Actor-18M comprises 1.6M videos with 18M corresponding human images, covering both arbitrary views and canonical three-view representations. Leveraging Actor-18M, we propose WildActor, a framework for any-view conditioned human video generation. We introduce an Asymmetric Identity-Preserving Attention mechanism coupled with a Viewpoint-Adaptive Monte Carlo Sampling strategy that iteratively re-weights reference conditions by marginal utility for balanced manifold coverage. Evaluated on the proposed Actor-Bench, WildActor consistently preserves body identity under diverse shot compositions, large viewpoint transitions, and substantial motions, surpassing existing methods in these challenging settings.
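The abstract does not spell out how the sampling strategy computes marginal utility, so the following is only a minimal sketch of the general idea of iteratively re-weighting reference views so that the selected set covers viewpoints evenly. Everything in it is an assumption for illustration: each reference is reduced to a single azimuth angle in degrees, the utility of a candidate is taken to be its angular distance to the nearest already-selected view, and the names `angular_distance`, `marginal_utility`, and `sample_references` are hypothetical rather than taken from the paper.

```python
import random


def angular_distance(a, b):
    """Smallest absolute difference between two azimuth angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def marginal_utility(candidate, selected):
    """Assumed utility: distance from the candidate to the nearest selected view.

    A candidate that duplicates an already-selected viewpoint adds no coverage
    and gets utility 0; with nothing selected yet, every candidate gets the
    maximum possible angular distance (180 degrees).
    """
    if not selected:
        return 180.0
    return min(angular_distance(candidate, s) for s in selected)


def sample_references(azimuths, k, rng):
    """Draw up to k reference indices without replacement.

    After each draw the weights of the remaining candidates are recomputed
    from their marginal utility, so views far from the current selection
    become more likely to be picked next.
    """
    remaining = list(range(len(azimuths)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        selected = [azimuths[i] for i in chosen]
        weights = [marginal_utility(azimuths[i], selected) for i in remaining]
        if sum(weights) == 0:
            # All remaining views duplicate selected ones: fall back to uniform.
            weights = [1.0] * len(remaining)
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


if __name__ == "__main__":
    # Three near-duplicate front views and one back view (hypothetical data).
    views = [0.0, 0.0, 0.0, 180.0]
    print(sample_references(views, k=2, rng=random.Random(0)))
```

With three identical front views and one back view, drawing two references under this weighting always includes the back view, whereas uniform sampling would miss it half the time. That is the kind of balanced coverage the re-weighting is meant to encourage, though the paper's actual utility function and sampling space may differ.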

Problem

Research questions and friction points this paper is trying to address.

identity-preserving

video generation

full-body consistency

unconstrained viewpoints

human video synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Identity-Preserving Video Generation

Asymmetric Identity-Preserving Attention

Viewpoint-Adaptive Monte Carlo Sampling