🤖 AI Summary
Foreground-background separation in stage performance capture remains challenging due to dynamic lighting, motion blur in foreground performers, complex and cluttered backgrounds, and severe scarcity of pixel-level annotations—factors inadequately addressed by existing matting methods.
Method: This paper presents the first systematic analysis of these stage-specific challenges and introduces an active-intervention matting workflow tailored for theatrical content. It comprises (i) controllable shooting-stage design (e.g., optimized lighting and background setup) and (ii) a lightweight post-processing annotation adaptation mechanism that enables efficient integration of state-of-the-art matting models—including diffusion-prior-based architectures—without requiring extensive manual labeling. The framework supports both offline high-accuracy and real-time low-latency inference.
Results: Experiments on professional stage recordings demonstrate significant improvements in alpha matte quality (PSNR +2.1 dB, F-score +4.3%), validating its industrial viability and strong generalization across diverse theatrical productions.
📝 Abstract
Capture stages are high-end sources of state-of-the-art recordings for downstream applications in movies, games, and other media. One crucial step in almost all pipelines is the matting of images to isolate the captured performances from the background. While common matting algorithms deliver remarkable performance in other applications like teleconferencing and mobile entertainment, we found that they struggle significantly with the peculiarities of capture stage content. The goal of our work is to share insights into those challenges as a curated list of those characteristics along with a constructive discussion for proactive intervention and present a guideline to practitioners for an improved workflow to mitigate unresolved challenges. To this end, we also demonstrate an efficient pipeline to adapt state-of-the-art approaches to such custom setups without the need of extensive annotations, both offline and real-time. For an objective evaluation, we propose a validation methodology based on a leading diffusion model that highlights the benefits of our approach.