🤖 AI Summary
This work addresses the quality degradation of reselected key frames in Live Photos, which arises because alternate frames come from the video ISP pipeline rather than the higher-quality photo capture pipeline. The authors propose LiveMoments, a reference-guided image restoration framework that uses the original high-quality key photo to guide the recovery of the reselected frame. The method combines reference-based guidance with diffusion models and introduces a unified motion alignment mechanism that aligns the reference with the reselected frame in both latent and pixel spaces. A dual-branch network architecture enables multi-level transfer of structural and textural features. Experiments on real-world and synthetic Live Photos datasets show that LiveMoments significantly outperforms existing approaches in perceptual quality and fidelity, especially in scenes with fast motion or complex structures.
📝 Abstract
A Live Photo captures both a high-quality key photo and a short video clip to preserve the dynamics around the captured moment. Users may choose an alternative frame as the key photo to capture a better expression or timing, but these frames often exhibit noticeable quality degradation, because the photo capture ISP pipeline delivers significantly higher image quality than the video pipeline. This quality gap highlights the need for dedicated restoration techniques to enhance the reselected key photo. To this end, we propose LiveMoments, a reference-guided image restoration framework tailored for the reselected key photo in Live Photos. Our method employs a two-branch neural network: a reference branch that extracts structural and textural information from the original high-quality key photo, and a main branch that restores the reselected frame using the guidance provided by the reference branch. Furthermore, we introduce a unified Motion Alignment module that incorporates motion guidance for spatial alignment at both the latent and image levels. Experiments on real and synthetic Live Photos demonstrate that LiveMoments significantly improves perceptual quality and fidelity over existing solutions, especially in scenes with fast motion or complex structures. Our code is available at https://github.com/OpenVeraTeam/LiveMoments.
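The abstract does not spell out how the Motion Alignment module works. As a rough illustration of what spatial alignment at both the image and latent levels could look like, the sketch below backward-warps a reference with a dense motion field at full resolution and again at a downscaled latent resolution. Everything in it is an assumption made for illustration: the function names `warp_nearest` and `downscale_flow`, the (dx, dy) flow convention, nearest-neighbor sampling, and the latent downscale factor of 2 are hypothetical and are not taken from the LiveMoments code.

```python
import numpy as np


def warp_nearest(ref, flow):
    """Backward-warp `ref` (H, W, C) with `flow` (H, W, 2), stored as (dx, dy) in pixels.

    Output pixel (y, x) is read from ref at (y + dy, x + dx), using
    nearest-neighbor sampling and clamping coordinates at the border.
    Hypothetical helper, not part of the LiveMoments code.
    """
    h, w = ref.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return ref[src_y, src_x]


def downscale_flow(flow, factor):
    """Subsample a pixel-space flow to latent resolution and rescale its magnitude.

    A displacement of `factor` pixels corresponds to one latent cell
    (assumed relationship between image and latent grids).
    """
    return flow[::factor, ::factor] / factor


# Toy data: an 8x8 single-channel "reference image" and a 4x4 "reference latent".
ref_image = np.arange(64, dtype=float).reshape(8, 8, 1)
ref_latent = np.arange(16, dtype=float).reshape(4, 4, 1)

# Assumed motion: every pixel of the target frame finds its match 2 px to the right.
flow = np.zeros((8, 8, 2))
flow[..., 0] = 2.0

aligned_image = warp_nearest(ref_image, flow)                       # image-level alignment
aligned_latent = warp_nearest(ref_latent, downscale_flow(flow, 2))  # latent-level alignment
```

Using the same motion field at both resolutions keeps the two alignments consistent with each other, which is one plausible reading of "unified"; a practical system would likely use bilinear sampling and an estimated optical flow rather than the constant field used here.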