Beyond the Ground Truth: Enhanced Supervision for Image Restoration

📅 2025-12-03
🤖 AI Summary
Supervised learning for real-world image restoration is limited by the quality of the ground-truth images in labeled datasets, which is constrained by how the data can be acquired. To address this, the authors propose a framework that enhances the ground truth itself, operating in the frequency domain. An adaptive frequency mask mixes the frequency components of the original ground truth with those of its super-resolved counterpart, yielding an enhanced ground truth that gains fine detail while staying semantically consistent with the original and avoiding hallucinated artifacts. A conditional frequency mask generator learns these masks, and a lightweight output refinement network trained on the enhanced ground truth can be attached to existing restoration models. Experiments show consistent improvements in the quality of restored images across multiple restoration models under real-world conditions, and user studies support the effectiveness of both the supervision enhancement and the output refinement components.

📝 Abstract
Deep learning-based image restoration has achieved significant success. However, when addressing real-world degradations, model performance is limited by the quality of ground-truth images in datasets due to practical constraints in data acquisition. To address this limitation, we propose a novel framework that enhances existing ground truth images to provide higher-quality supervision for real-world restoration. Our framework generates perceptually enhanced ground truth images using super-resolution by incorporating adaptive frequency masks, which are learned by a conditional frequency mask generator. These masks guide the optimal fusion of frequency components from the original ground truth and its super-resolved variants, yielding enhanced ground truth images. This frequency-domain mixup preserves the semantic consistency of the original content while selectively enriching perceptual details, preventing hallucinated artifacts that could compromise fidelity. The enhanced ground truth images are used to train a lightweight output refinement network that can be seamlessly integrated with existing restoration models. Extensive experiments demonstrate that our approach consistently improves the quality of restored images. We further validate the effectiveness of both supervision enhancement and output refinement through user studies. Code is available at https://github.com/dhryougit/Beyond-the-Ground-Truth.
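The frequency-domain mixup described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the paper learns its masks with a conditional frequency mask generator, whereas the sketch below substitutes a fixed radial low-pass mask. The function names `radial_lowpass_mask` and `frequency_mixup`, the `cutoff` parameter, and the convention that the mask weights the original ground truth are all assumptions made for illustration.

```python
import numpy as np


def radial_lowpass_mask(height, width, cutoff=0.25):
    """Hypothetical fixed mask (a stand-in for a learned one).

    Returns 1.0 where the normalized spatial frequency is at most `cutoff`
    and 0.0 elsewhere, laid out to match the output of np.fft.fft2.
    """
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return (radius <= cutoff).astype(np.float64)


def frequency_mixup(gt, sr, keep_gt_mask):
    """Blend two same-sized grayscale images in the Fourier domain.

    `keep_gt_mask` holds per-frequency weights in [0, 1]: 1 keeps the
    original ground truth at that frequency, 0 takes the super-resolved
    image. This weighting convention is an assumption of the sketch.
    """
    if not (gt.shape == sr.shape == keep_gt_mask.shape):
        raise ValueError("gt, sr and keep_gt_mask must share one shape")
    gt_freq = np.fft.fft2(gt)
    sr_freq = np.fft.fft2(sr)
    mixed = keep_gt_mask * gt_freq + (1.0 - keep_gt_mask) * sr_freq
    # The imaginary part is only numerical noise for a symmetric mask.
    return np.real(np.fft.ifft2(mixed))


# Toy data standing in for a ground-truth image and its super-resolved variant.
rng = np.random.default_rng(0)
gt = rng.random((32, 32))
sr = rng.random((32, 32))
mask = radial_lowpass_mask(32, 32)
enhanced = frequency_mixup(gt, sr, mask)
```

With this fixed mask, the low-frequency content (overall structure and brightness) comes from the original ground truth and the high-frequency content (fine detail) comes from the super-resolved image. A learned, image-conditioned mask would replace `radial_lowpass_mask` and could choose a different blend per frequency.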
Problem

Research questions and friction points this paper is trying to address.

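Ground-truth images in real-world datasets are of limited quality because of acquisition constraints
Restoration models trained on this ground truth are capped by the quality of their supervision
Enriching the ground truth with super-resolved detail risks hallucinated artifacts that compromise fidelity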
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates enhanced ground truth via adaptive frequency masks
Uses frequency-domain mixup to enrich details without artifacts
Trains lightweight refinement network for existing restoration models
Donghun Ryou
Computer Vision Laboratory, ECE & IPAI, Seoul National University
Inju Ha
Computer Vision Laboratory, ECE & IPAI, Seoul National University
Sanghyeok Chu
Computer Vision Laboratory, ECE & IPAI, Seoul National University
Bohyung Han
Professor, Electrical and Computer Engineering, Seoul National University
Computer vision, machine learning, deep learning