🤖 AI Summary
Existing face-swapping methods struggle to balance real-time performance and visual fidelity, hindering deployment in interactive applications such as live streaming and video conferencing. This paper introduces the first NeRF-based face-swapping framework capable of real-time inference (≥33 FPS), integrating dynamic facial modeling, a lightweight neural rendering pipeline, and a real-time pose-driven mechanism for end-to-end efficiency. The method achieves high-fidelity geometric and appearance reconstruction while reducing end-to-end latency to just 30 ms and supporting 1080p output. Extensive experiments demonstrate that the framework delivers low latency, high stability, and strong generalization under realistic interactive conditions. It significantly enhances both the real-time responsiveness and visual quality of digital avatars, showing strong feasibility for practical engineering deployment.
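The summary's two headline numbers are two views of the same quantity: a 30 ms end-to-end per-frame latency corresponds to roughly 33 FPS. A minimal sketch of that arithmetic (an illustration only, not code from the paper):

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frame rate achievable at a given end-to-end per-frame latency."""
    return 1000.0 / latency_ms

# 30 ms per frame -> ~33.3 FPS, consistent with the reported real-time figure
print(fps_from_latency_ms(30.0))
```

This assumes frames are produced serially with no pipelining; overlapping stages could raise throughput above 1/latency.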
📝 Abstract
Face replacement technology enables significant advances in entertainment, education, and communication, including dubbing, virtual avatars, and cross-cultural content adaptation. Our LiveNeRF framework addresses critical limitations of existing methods by achieving real-time performance (33 FPS) with superior visual quality, enabling practical deployment in live streaming, video conferencing, and interactive media. The technology particularly benefits content creators, educators, and individuals with speech impairments through accessible avatar communication. While acknowledging potential misuse in unauthorized deepfake creation, we advocate for responsible deployment with user consent verification and integration with detection systems, ensuring positive societal impact while minimizing risks.