SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator

📅 2025-10-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing conditional GAN discriminators struggle to jointly optimize sample fidelity and conditional alignment. To address this, we propose a novel conditional discriminator featuring a decoupled architecture: an unconditional branch evaluates sample authenticity, while a matching-aware branch models conditional consistency; separation via projection-based structures and inductive biases further enhances feature disentanglement. We additionally introduce matching-aware supervision, adaptive loss weighting, and a dual-branch output mechanism to dynamically balance the two objectives during training. Our method achieves significant improvements in FID and CLIP Score on both class-conditional and text-to-image generation benchmarks, consistently outperforming state-of-the-art approaches in both sample quality and conditional alignment. Comprehensive experiments validate its effectiveness, robustness, and generalizability across diverse conditional generation tasks.

Technology Category

Application Category

📝 Abstract
Deep generative models have made significant advances in generating complex content, yet conditional generation remains a fundamental challenge. Existing conditional generative adversarial networks often struggle to balance the dual objectives of assessing authenticity and conditional alignment of input samples within their conditional discriminators. To address this, we propose a novel discriminator design that integrates three key capabilities: unconditional discrimination, matching-aware supervision to enhance alignment sensitivity, and adaptive weighting to dynamically balance all objectives. Specifically, we introduce Sum of Naturalness and Alignment (SONA), which employs separate projections for naturalness (authenticity) and alignment in the final layer with an inductive bias, supported by dedicated objective functions and an adaptive weighting mechanism. Extensive experiments on class-conditional generation tasks show that ours achieves superior sample quality and conditional alignment compared to state-of-the-art methods. Furthermore, we demonstrate its effectiveness in text-to-image generation, confirming the versatility and robustness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Balancing authenticity and alignment in conditional generation
Improving conditional discriminator design with three capabilities
Enhancing sample quality and alignment in generative tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Separate projections for naturalness and alignment
Adaptive weighting mechanism for dynamic balance
Matching-aware supervision enhances alignment sensitivity
🔎 Similar Papers
No similar papers found.