🤖 AI Summary
Moiré patterns in videos arise from spatial aliasing between high-frequency scene content and the camera's sampling grid; they closely resemble authentic textures while degrading tonal consistency and temporal coherence. This paper proposes the first focus-defocus dual-camera moiré removal framework: it leverages the inherent moiré suppression of defocused video to guide artifact discrimination and removal in the sharp video; aligns the two streams via optical flow and reconstructs frames using a multi-scale CNN trained with a multi-dimensional loss; and applies joint bilateral filtering to ensure structural fidelity and temporal stability. Experiments demonstrate that the method significantly outperforms state-of-the-art image- and video-demoiréing approaches, effectively eliminating moiré artifacts while better preserving fine texture details and chromatic consistency.
📝 Abstract
Moiré patterns, unwanted color artifacts in images and videos, arise from the interference between spatially high-frequency scene content and the discrete spatial sampling of digital cameras. Existing demoiréing methods primarily rely on single-camera image/video processing, which faces two critical challenges: 1) distinguishing moiré patterns from visually similar real textures, and 2) preserving tonal consistency and temporal coherence while removing moiré artifacts. To address these issues, we propose a dual-camera framework that captures synchronized videos of the same scene: one in focus (retaining high-quality textures but possibly exhibiting moiré patterns) and one defocused (with significantly reduced moiré patterns but blurred textures). We use the defocused video to help distinguish moiré patterns from real textures, thereby guiding the demoiréing of the focused video. We propose a frame-wise demoiréing pipeline, which begins with an optical-flow-based alignment step to address any discrepancies in displacement and occlusion between the focused and defocused frames. We then leverage the aligned defocused frame to guide the demoiréing of the focused frame using a multi-scale CNN trained with a multi-dimensional loss. To maintain tonal and temporal consistency, our final step applies a joint bilateral filter to the input focused frame, using the CNN's demoiréing result as the guide, to obtain the final output. Experimental results demonstrate that our proposed framework substantially outperforms state-of-the-art image and video demoiréing methods.
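The final filtering step described above can be illustrated with a minimal NumPy sketch of a joint bilateral filter: spatial weights come from pixel distance, while range weights are computed from the guide image (here, the CNN's demoiréd result) rather than from the input itself, so the filter smooths moiré ripple in the focused frame while respecting edges present in the guide. Function and parameter names are illustrative, not the paper's implementation.

```python
import numpy as np

def joint_bilateral_filter(src, guide, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Filter single-channel `src`, with range weights taken from `guide`.

    src, guide : 2-D float arrays of equal shape
    radius     : half-width of the square filter window
    sigma_s    : spatial Gaussian standard deviation (pixels)
    sigma_r    : range Gaussian standard deviation (guide intensity units)
    """
    H, W = src.shape
    pad = radius
    # Reflect-pad so every pixel has a full window.
    src_p = np.pad(src, pad, mode="reflect")
    gd_p = np.pad(guide, pad, mode="reflect")
    acc = np.zeros_like(src)
    wsum = np.zeros_like(src)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Spatial weight depends only on the offset.
            w_s = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s**2))
            shifted_src = src_p[pad + dy:pad + dy + H, pad + dx:pad + dx + W]
            shifted_gd = gd_p[pad + dy:pad + dy + H, pad + dx:pad + dx + W]
            # Range weight: similarity measured in the *guide*, not in src.
            w_r = np.exp(-((shifted_gd - guide) ** 2) / (2.0 * sigma_r**2))
            w = w_s * w_r
            acc += w * shifted_src
            wsum += w
    return acc / wsum
```

Where the guide is locally flat (moiré removed by the CNN), the range weights are near uniform and the filter averages out residual ripple in `src`; where the guide has a texture edge, the range weights shut off averaging across it, preserving structure.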