Image Translation with Kernel Prediction Networks for Semantic Segmentation

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address performance degradation in semantic segmentation caused by domain shift between synthetic and real data, this paper proposes the Domain Adversarial Kernel Prediction Network (DA-KPN), a novel unpaired image translation method. Unlike conventional GAN-based frameworks relying on cycle consistency, DA-KPN introduces learnable pixel-wise kernel parameters and employs a lightweight mapping function to generate spatially adaptive transformations, explicitly enforcing pixel-level semantic consistency between translated images and their synthetic labels. Multi-scale discriminators are integrated into an adversarial training scheme to jointly preserve photorealism and enhance semantic alignment. Experiments demonstrate that DA-KPN significantly outperforms state-of-the-art GAN methods on syn-to-real semantic segmentation benchmarks—especially under low real-label supervision—and achieves competitive performance on facial parsing tasks, validating its generalizability and practical utility.

📝 Abstract
Semantic segmentation relies on many dense pixel-wise annotations to achieve the best performance, but owing to the difficulty of obtaining accurate annotations for real-world data, practitioners train on large-scale synthetic datasets. Unpaired image translation is one method used to address the ensuing domain gap by generating more realistic training data in low-data regimes. Current methods for unpaired image translation train generative adversarial networks (GANs) to perform the translation and enforce pixel-level semantic matching through cycle consistency. These methods do not guarantee that the semantic matching holds, posing a problem for semantic segmentation where performance is sensitive to noisy pixel labels. We propose a novel image translation method, Domain Adversarial Kernel Prediction Network (DA-KPN), that guarantees semantic matching between the synthetic label and translation. DA-KPN estimates pixel-wise input transformation parameters of a lightweight and simple translation function. To ensure the pixel-wise transformation is realistic, DA-KPN uses multi-scale discriminators to distinguish between translated and target samples. We show DA-KPN outperforms previous GAN-based methods on syn2real benchmarks for semantic segmentation with limited access to real image labels and achieves comparable performance on face parsing.
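The multi-scale discriminator setup mentioned in the abstract can be sketched as follows. The paper does not specify the downsampling scheme or discriminator interface, so both are illustrative assumptions here: the pyramid uses 2x average pooling, and each discriminator is simply any callable that maps an image to a realism score.

```python
import numpy as np

def image_pyramid(img, num_scales=3):
    """Build a multi-scale pyramid by repeated 2x average pooling.
    (The pooling choice is an assumption for illustration.)"""
    scales = [img]
    for _ in range(num_scales - 1):
        h, w, c = scales[-1].shape
        # Crop to even dimensions, then average each 2x2 block.
        pooled = (scales[-1][: h // 2 * 2, : w // 2 * 2]
                  .reshape(h // 2, 2, w // 2, 2, c)
                  .mean(axis=(1, 3)))
        scales.append(pooled)
    return scales

def multi_scale_scores(img, discriminators):
    """Score one pyramid level per discriminator.
    Each discriminator is a hypothetical callable: image -> scalar score."""
    levels = image_pyramid(img, num_scales=len(discriminators))
    return [d(level) for d, level in zip(discriminators, levels)]
```

In an adversarial training loop, each scale's score would contribute to the realism loss, so the translation is penalized for artifacts at both fine and coarse resolutions.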
Problem

Research questions and friction points this paper is trying to address.

Address domain gap in semantic segmentation training data
Ensure semantic matching between synthetic and translated images
Improve performance with limited real image labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Domain Adversarial Kernel Prediction Network
Estimates pixel-wise transformation parameters
Employs multi-scale discriminators for realism
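The pixel-wise translation idea in the bullets above can be sketched as a minimal example. The affine form (a per-pixel scale and bias) and the function name are assumptions for illustration, not the paper's exact parameterization; in DA-KPN the parameters would come from the kernel prediction network, while here they are plain arrays.

```python
import numpy as np

def pixelwise_affine_translate(image, scale, bias):
    """Apply a lightweight per-pixel transform:
    out[h, w, c] = scale[h, w, c] * image[h, w, c] + bias[h, w, c].
    scale/bias stand in for predicted transformation parameters
    (illustrative assumption). Values are clipped to the valid [0, 1] range."""
    assert image.shape == scale.shape == bias.shape
    return np.clip(scale * image + bias, 0.0, 1.0)

# Example: brighten a uniform gray image per pixel.
img = np.full((2, 2, 3), 0.5)
out = pixelwise_affine_translate(img,
                                 np.full_like(img, 1.2),
                                 np.full_like(img, 0.1))
```

Because the transform acts on each pixel independently and never moves content spatially, the synthetic label map stays aligned with the translated image, which is the semantic-matching property the abstract emphasizes.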