SemiNFT: Learning to Transfer Presets from Imitation to Appreciation via Hybrid-Sample Reinforcement Learning

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing color preset transfer methods often neglect semantic context and human aesthetic preferences, relying solely on pixel-level statistics and thus failing to meet the high-quality editing demands of non-professional users. This work proposes SemiNFT, a framework that emulates the human artistic learning process—from imitation to appreciation—by first learning structure-preserving color mappings through paired triplets, then enhancing aesthetic perception via a hybrid online–offline reinforcement learning strategy leveraging unpaired data. To mitigate skill forgetting, the method incorporates a structural consistency constraint. Built upon the Diffusion Transformer architecture, SemiNFT outperforms state-of-the-art approaches on standard benchmarks and demonstrates remarkable generalization and aesthetic understanding in zero-shot cross-domain tasks, such as grayscale-to-color and anime-to-photorealistic image translation.
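For context, the kind of pixel-statistics baseline the summary criticizes can be sketched as per-channel mean and standard-deviation matching. This is a generic illustration of statistical color transfer, not code from the SemiNFT paper; the function name `match_channel_stats` and the toy pixel lists are assumptions made for the example.

```python
from statistics import fmean, pstdev

def match_channel_stats(source, reference, eps=1e-8):
    """Shift and scale each RGB channel of `source` to the mean/std of `reference`.

    Both arguments are lists of (r, g, b) tuples with values in [0, 1].
    Hypothetical helper for illustration only.
    """
    out_channels = []
    for c in range(3):
        src = [p[c] for p in source]
        ref = [p[c] for p in reference]
        mu_s, sd_s = fmean(src), pstdev(src)
        mu_r, sd_r = fmean(ref), pstdev(ref)
        scale = sd_r / (sd_s + eps)  # eps guards against a flat source channel
        # The same affine map is applied to every pixel, whatever it depicts.
        out_channels.append(
            [min(1.0, max(0.0, (v - mu_s) * scale + mu_r)) for v in src]
        )
    return list(zip(*out_channels))

# Toy data: a gray ramp as the source, a reddish ramp as the reference.
source = [(0.2, 0.2, 0.2), (0.4, 0.4, 0.4), (0.6, 0.6, 0.6)]
reference = [(0.5, 0.1, 0.1), (0.7, 0.2, 0.1), (0.9, 0.3, 0.1)]
result = match_channel_stats(source, reference)
```

Because the mapping depends only on global channel statistics, a sky pixel and a skin pixel with the same source color always receive the same output color, which is the lack of semantic awareness the summary points to.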

📝 Abstract
Photorealistic color retouching plays a vital role in visual content creation, yet manual retouching remains inaccessible to non-experts due to its reliance on specialized expertise. Reference-based methods offer a promising alternative by transferring the preset color of a reference image to a source image. However, these approaches often operate as novice learners, performing global color mappings derived from pixel-level statistics, without a true understanding of semantic context or human aesthetics. To address this issue, we propose SemiNFT, a Diffusion Transformer (DiT)-based retouching framework that mirrors the trajectory of human artistic training: beginning with rigid imitation and evolving into intuitive creation. Specifically, SemiNFT is first taught with paired triplets to acquire basic structural preservation and color mapping skills, and then advanced to reinforcement learning (RL) on unpaired data to cultivate nuanced aesthetic perception. Crucially, during the RL stage, to prevent catastrophic forgetting of old skills, we design a hybrid online-offline reward mechanism that anchors aesthetic exploration with structural review. Extensive experiments show that SemiNFT not only outperforms state-of-the-art methods on standard preset transfer benchmarks but also demonstrates remarkable intelligence in zero-shot tasks, such as black-and-white photo colorization and cross-domain (anime-to-photo) preset transfer. These results confirm that SemiNFT transcends simple statistical matching and achieves a sophisticated level of aesthetic comprehension. Our project can be found at https://melanyyang.github.io/SemiNFT/.
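The abstract does not spell out how the hybrid online-offline reward is computed. One possible reading, sketched below under loudly stated assumptions, is that a training batch mixes online samples (outputs on unpaired data, scored by an aesthetic model) with offline samples (outputs on paired data, scored against their ground-truth target). Every name here (`Sample`, `structural_reward`, `hybrid_rewards`, `offline_weight`) is hypothetical, and the mean-absolute-error score and brightness-based scorer are placeholders, not the paper's reward functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Image = Sequence[float]  # flattened pixel values in [0, 1]; stand-in for a real image tensor

@dataclass
class Sample:
    output: Image                   # image produced by the current policy
    target: Optional[Image] = None  # ground truth; present only for offline (paired) samples

def structural_reward(output: Image, target: Image) -> float:
    """Placeholder structural score: 1 minus mean absolute error (an assumption)."""
    mae = sum(abs(o - t) for o, t in zip(output, target)) / len(output)
    return 1.0 - mae

def hybrid_rewards(
    samples: List[Sample],
    aesthetic_score: Callable[[Image], float],
    offline_weight: float = 1.0,
) -> List[float]:
    """Score online samples aesthetically and offline samples structurally."""
    rewards = []
    for s in samples:
        if s.target is None:
            # Online branch: unpaired data, aesthetic exploration.
            rewards.append(aesthetic_score(s.output))
        else:
            # Offline branch: paired data, structural review of earlier skills.
            rewards.append(offline_weight * structural_reward(s.output, s.target))
    return rewards

def toy_aesthetic(img: Image) -> float:
    """Toy scorer: mean brightness stands in for a learned aesthetic model."""
    return sum(img) / len(img)

batch = [Sample(output=[0.5, 0.7]), Sample(output=[0.2, 0.4], target=[0.2, 0.6])]
rewards = hybrid_rewards(batch, toy_aesthetic)
```

In this sketch the offline branch plays the role of the "structural review" anchor: a policy that drifts away from structure-preserving outputs loses reward on paired samples even if its aesthetic score on unpaired samples rises.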
Problem

Research questions and friction points this paper is trying to address.

color retouching
preset transfer
aesthetic perception
reference-based image editing
photorealistic colorization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid-Sample Reinforcement Learning
Diffusion Transformer
Preset Transfer
Aesthetic Perception
Catastrophic Forgetting Mitigation
Melany Yang
vivo Mobile Communication Co. Ltd, Zhejiang University
Yuhang Yu
vivo Mobile Communication Co. Ltd
Diwang Weng
vivo Mobile Communication Co. Ltd
Jinwei Chen
vivo
computer vision
Wei Dong
vivo Mobile Communication Co. Ltd