UniSER: A Foundation Model for Unified Soft Effects Removal

📅 2025-11-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Digital images are frequently degraded by “soft effects” such as lens flare, haze, shadows, and reflections, which are commonly characterized as semi-transparent occlusions. Existing approaches either use isolated, task-specific models or rely on general-purpose image editing frameworks that depend heavily on user prompts and offer limited fidelity. This paper proposes the first unified framework that models all soft effects as quantifiable semi-transparent occlusions. The authors construct a large-scale paired dataset (3.8M synthetic samples) grounded in physically plausible degradation modeling, introduce fine-grained occlusion masks and intensity control mechanisms, and realize end-to-end restoration with a diffusion-based Transformer architecture. With physics-informed data synthesis and targeted fine-tuning, the method significantly outperforms both specialized and general-purpose editing systems on real-world images. It enables robust, high-fidelity removal of both single and mixed soft degradations, combining generality with physical consistency in soft-effect restoration.

📝 Abstract

Digital images are often degraded by soft effects such as lens flare, haze, shadows, and reflections, which reduce aesthetics even though the underlying pixels remain partially visible. Prevailing works address these degradations in isolation, developing highly specialized models that lack scalability and fail to exploit the shared underlying essence of these restoration problems. Recent large-scale pretrained generalist models offer powerful, text-driven image editing capabilities, but general-purpose systems (e.g., GPT-4o, Flux Kontext, Nano Banana) require detailed prompts and often fail to achieve robust removal on these fine-grained tasks or to preserve the identity of the scene. Leveraging the common essence of soft effects, i.e., semi-transparent occlusions, we introduce UniSER, a versatile foundation model capable of addressing diverse degradations caused by soft effects within a single framework. Our methodology centers on two components: a curated 3.8M-pair dataset that ensures robustness and generalization and includes novel, physically plausible data to fill critical gaps in public benchmarks; and a tailored training pipeline that fine-tunes a Diffusion Transformer to learn robust restoration priors from this diverse data, integrating fine-grained mask and strength controls. Together, these allow UniSER to significantly outperform both specialist and generalist models, achieving robust, high-fidelity restoration in the wild.
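
To make the idea of a "semi-transparent occlusion" with mask and strength controls concrete, here is a minimal sketch of how a degraded image could be synthesized from a clean one. The blending formula (standard alpha compositing scaled by a strength factor), the function name `composite_soft_effect`, and the single-channel list-of-lists image representation are all assumptions made for illustration; the abstract does not specify the degradation model or data pipeline actually used.

```python
from typing import List

# Single-channel image as rows of floats in [0, 1] (illustrative representation).
Image = List[List[float]]


def composite_soft_effect(
    clean: Image, effect: Image, alpha: Image, strength: float = 1.0
) -> Image:
    """Blend a semi-transparent effect layer over a clean image.

    Assumed model (not taken from the paper), applied per pixel:
        degraded = (1 - s * a) * clean + s * a * effect
    where `a` is the soft occlusion mask value and `s` a global strength.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    degraded: Image = []
    for clean_row, effect_row, alpha_row in zip(clean, effect, alpha):
        row = []
        for b, e, a in zip(clean_row, effect_row, alpha_row):
            w = strength * a  # effective opacity of the effect at this pixel
            row.append((1.0 - w) * b + w * e)
        degraded.append(row)
    return degraded


clean = [[0.2, 0.4], [0.6, 0.8]]
effect = [[1.0, 1.0], [1.0, 1.0]]  # a bright, haze-like layer
alpha = [[0.0, 0.5], [0.5, 1.0]]   # soft mask: 0 = untouched, 1 = fully covered

full = composite_soft_effect(clean, effect, alpha, strength=1.0)
half = composite_soft_effect(clean, effect, alpha, strength=0.5)
```

Under this assumed model, a pixel with mask value 0 is left unchanged at any strength, and lowering `strength` moves every pixel back toward the clean image. Pairs of `(degraded, clean)` images built this way are the kind of supervision a restoration model could be trained on.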

Problem

Research questions and friction points this paper is trying to address.

Unified removal of diverse soft effects like flare and haze

Overcoming limitations of specialized single-effect restoration models

Addressing generalist models' failure in robust fine-grained restoration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified model removes diverse soft effects simultaneously

Massive dataset enables robust generalization across degradations

Fine-tuned Diffusion Transformer with mask and strength controls

🔎 Similar Papers

SoftShadow: Leveraging Penumbra-Aware Soft Masks for Shadow Removal