AI Summary
This work addresses a limitation of existing 2D-to-3D conversion methods: despite achieving geometric accuracy, they often lack artistic expressiveness and fail to replicate the immersive depth and emotional resonance characteristic of professional 3D cinema. To bridge this gap, the authors propose a novel paradigm termed "artistic disparity synthesis," implemented through the Art3D framework. The approach decouples global depth parameters from local artistic effects, shifting the conversion objective from physically accurate disparity estimation to artistically consistent disparity generation. Using a dual-path architecture and indirect supervision from professional 3D film data, the method enables art-directed depth modeling. A new quantitative metric is introduced to evaluate alignment with cinematic style. Experiments demonstrate that the proposed method reproduces key out-of-screen pop-out effects and closely matches the global depth aesthetics of professional 3D content, validating the feasibility of art-driven stereoscopic conversion.
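The summary describes the dual-path decoupling only at a high level. As one concrete reading, the sketch below shows how global "macro-intent" parameters (an overall disparity scale and a zero-plane shift) could be predicted separately from a dense residual map of local "brushstroke" effects and then composed into a final disparity map. All module names, layer sizes, the two-parameter global head, and the composition rule are illustrative assumptions, not Art3D's published architecture.

```python
# Illustrative sketch only: the paper does not detail Art3D's layers, so every module
# name, channel size, and the composition rule below is a hypothetical reading of
# "dual-path: global depth parameters (macro-intent) + local artistic effects".
import torch
import torch.nn as nn


class GlobalDepthPath(nn.Module):
    """Predicts a few scalar 'macro-intent' parameters per frame, e.g. an overall
    disparity scale and a zero-plane (convergence) shift used for pop-out framing."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)  # [disparity_scale, zero_plane_shift]

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(frame).flatten(1))


class LocalEffectPath(nn.Module):
    """Predicts a dense residual disparity map: the local 'visual brushstrokes'
    (e.g. depth sculpting around a subject) added on top of the global layout."""

    def __init__(self, in_channels: int = 4):  # RGB + a base disparity prior
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, frame: torch.Tensor, base_disparity: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([frame, base_disparity], dim=1))


def synthesize_disparity(frame, base_disparity, global_path, local_path):
    """Compose the two paths: rescale and shift a geometric disparity prior with the
    global parameters, then add the locally predicted artistic residual."""
    params = global_path(frame)                   # (B, 2)
    scale = params[:, 0].view(-1, 1, 1, 1)
    shift = params[:, 1].view(-1, 1, 1, 1)
    residual = local_path(frame, base_disparity)  # (B, 1, H, W)
    return scale * base_disparity + shift + residual


if __name__ == "__main__":
    frame = torch.rand(1, 3, 128, 224)
    base = torch.rand(1, 1, 128, 224)  # e.g. from any monocular depth estimator
    disparity = synthesize_disparity(frame, base, GlobalDepthPath(), LocalEffectPath())
    print(disparity.shape)  # torch.Size([1, 1, 128, 224])
```

The composition `scale * base + shift + residual` is just one plausible way to keep the global parameters interpretable (macro depth budget and convergence plane) while letting the local path sculpt depth around subjects; the actual coupling in Art3D may differ.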
Abstract
Current 2D-to-3D conversion methods achieve geometric accuracy but are artistically deficient, failing to replicate the immersive and emotionally resonant experience of professional 3D cinema. This is because geometric reconstruction paradigms mistake deliberate artistic intent, such as strategic zero-plane shifts for pop-out effects and local depth sculpting, for data noise or ambiguity. This paper argues for a new paradigm, Artistic Disparity Synthesis, which shifts the goal from physically accurate disparity estimation to artistically coherent disparity synthesis. We propose Art3D, a preliminary framework exploring this paradigm. Art3D uses a dual-path architecture to decouple global depth parameters (macro-intent) from local artistic effects (visual brushstrokes) and learns from professional 3D film data via indirect supervision. We also introduce a preliminary evaluation method to quantify cinematic alignment. Experiments show that our approach can replicate key local out-of-screen effects and align with the global depth styles of cinematic 3D content, laying the groundwork for a new class of artistically driven conversion tools.
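The abstract does not spell out the preliminary evaluation method, so the following is a hypothetical sketch of how alignment with cinematic depth style might be quantified: comparing the disparity distribution and the share of negative (out-of-screen) disparity in a converted frame against the same statistics measured on professional 3D reference footage. The histogram-intersection score, the pop-out-ratio penalty, and the disparity range are all assumed for illustration and are not the paper's metric.

```python
# Hypothetical illustration of a cinematic-alignment score: compare a synthesized
# disparity map's depth distribution and pop-out usage to a professional 3D reference.
import numpy as np


def disparity_stats(disparity: np.ndarray, bins: int = 64, d_range=(-40.0, 40.0)):
    """Summarize a disparity map: a normalized histogram plus the pop-out ratio
    (fraction of pixels in front of the zero plane, i.e. negative disparity)."""
    hist, _ = np.histogram(disparity, bins=bins, range=d_range, density=True)
    hist = hist / (hist.sum() + 1e-8)
    pop_out_ratio = float((disparity < 0).mean())
    return hist, pop_out_ratio


def cinematic_alignment(pred: np.ndarray, reference: np.ndarray) -> float:
    """Score in [0, 1]: higher means the predicted disparity matches the reference
    film's depth distribution and pop-out usage; built from histogram intersection
    minus the absolute gap in pop-out ratio (assumed form, not the paper's metric)."""
    pred_hist, pred_pop = disparity_stats(pred)
    ref_hist, ref_pop = disparity_stats(reference)
    hist_overlap = np.minimum(pred_hist, ref_hist).sum()  # histogram intersection
    pop_penalty = abs(pred_pop - ref_pop)
    return float(max(0.0, hist_overlap - pop_penalty))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pred = rng.normal(loc=2.0, scale=6.0, size=(540, 960))  # synthesized disparity
    ref = rng.normal(loc=1.5, scale=7.0, size=(540, 960))   # professional 3D reference
    print(f"cinematic alignment ~ {cinematic_alignment(pred, ref):.3f}")
```

In this assumed form, the score rewards matching the overall depth budget of reference footage while penalizing over- or under-use of out-of-screen effects; a learned or shot-level variant would likely be needed for the stylistic comparisons the paper describes.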