Towards In-Context Tone Style Transfer with A Large-Scale Triplet Dataset

📅 2026-04-17
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses two limitations of existing tone style transfer methods for photo retouching: the scarcity of high-quality triplet datasets, and the semantic loss and color distortion that arise when content and reference features are extracted independently. To overcome these, the authors construct TST100K, described as the first large-scale tone transfer dataset, comprising 100,000 content-reference-stylized image triplets. They also propose ICTone, a diffusion-based framework that conditions jointly on the content and reference images so that features from both are extracted in context rather than separately. A reward feedback mechanism driven by a tone style scorer further refines generation quality. Experiments show that the method achieves state-of-the-art performance on both quantitative metrics and human evaluations.
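
As a rough illustration of joint conditioning, the sketch below places a content image and a reference image on one canvas so that a single model pass can attend to both. The summary does not say how ICTone combines the two inputs; side-by-side spatial concatenation is an assumption made here for illustration, and `build_in_context_canvas` is a hypothetical helper, not part of the authors' code.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def build_in_context_canvas(content: Image, reference: Image) -> Image:
    """Concatenate content and reference side by side (assumed layout)."""
    if len(content) != len(reference):
        raise ValueError("images must share the same height")
    # Each canvas row holds the content row followed by the reference row.
    return [c_row + r_row for c_row, r_row in zip(content, reference)]


# Tiny 2x2 stand-in images; real inputs would be full photographs.
content = [[(10, 10, 10), (20, 20, 20)], [(30, 30, 30), (40, 40, 40)]]
reference = [[(200, 150, 100), (210, 160, 110)], [(220, 170, 120), (230, 180, 130)]]
canvas = build_in_context_canvas(content, reference)
```

The contrast with the independent-extraction design is that no separate encoder output has to be fused later; both images are visible to the same model from the start.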

📝 Abstract
Tone style transfer for photo retouching aims to adapt the stylistic tone of the reference image to a given content image. However, the lack of high-quality large-scale triplet datasets with stylized ground truth forces existing methods to rely on self-supervised or proxy objectives, which limits model capability. To mitigate this gap, we design a data construction pipeline to build TST100K, a large-scale dataset of 100,000 content-reference-stylized triplets. At the core of this pipeline, we train a tone style scorer to ensure strict stylistic consistency for each triplet. In addition, existing methods typically extract content and reference features independently and then fuse them in a decoder, which may cause semantic loss and lead to inappropriate color transfer and degraded visual aesthetics. Instead, we propose ICTone, a diffusion-based framework that performs tone transfer in an in-context manner by jointly conditioning on both images, leveraging the semantic priors of generative models for semantic-aware transfer. Reward feedback learning using the tone style scorer is further incorporated to improve stylistic fidelity and visual quality. Experiments demonstrate the effectiveness of TST100K, and ICTone achieves state-of-the-art performance on both quantitative metrics and human evaluations.
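
The scorer-based filtering step in the data pipeline can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract gives no detail on the scorer's inputs or threshold, so `Triplet`, `toy_tone_score`, `filter_triplets`, and the `0.9` threshold are all hypothetical, and the mean-color distance used here merely stands in for the trained tone style scorer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]


@dataclass
class Triplet:
    content: List[Pixel]    # original photo
    reference: List[Pixel]  # photo that carries the target tone
    stylized: List[Pixel]   # content rendered in the reference tone


def mean_color(pixels: List[Pixel]) -> Tuple[float, float, float]:
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def toy_tone_score(reference: List[Pixel], stylized: List[Pixel]) -> float:
    """Stand-in scorer: 1.0 means identical mean color, 0.0 means maximally far."""
    ref, sty = mean_color(reference), mean_color(stylized)
    dist = sum(abs(a - b) for a, b in zip(ref, sty)) / (3 * 255)
    return 1.0 - dist


def filter_triplets(
    candidates: List[Triplet],
    scorer: Callable[[List[Pixel], List[Pixel]], float],
    threshold: float = 0.9,  # assumed value, for illustration only
) -> List[Triplet]:
    """Keep only triplets whose stylized image matches the reference tone."""
    return [t for t in candidates if scorer(t.reference, t.stylized) >= threshold]


good = Triplet([(10, 10, 10)], [(200, 150, 100)], [(198, 152, 101)])
bad = Triplet([(10, 10, 10)], [(200, 150, 100)], [(20, 40, 220)])
kept = filter_triplets([good, bad], toy_tone_score)
```

Passing the scorer in as a function keeps the filter independent of how the score is computed, which matches the abstract's reuse of the same scorer for reward feedback learning.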
Problem

Research questions and friction points this paper is trying to address.

tone style transfer
photo retouching
triplet dataset
semantic consistency
color transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

tone style transfer
in-context learning
diffusion model
triplet dataset
reward feedback learning