Content-Adaptive Image Retouching Guided by Attribute-Based Text Representation

📅 2025-12-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing image retouching methods rely on global pixel-wise color mapping, neglecting semantic content variations and user-specific style preferences—leading to color distortion, regional inconsistency, and inadequate style alignment. To address these limitations, we propose a content-adaptive curve mapping framework integrated with attribute-driven textual representation learning. Our approach introduces a novel multi-basis curve mapping mechanism that enables semantic-region-aware, context-sensitive color adjustment. We further design an attribute text prediction module to generate interpretable, fine-grained style descriptions, and construct a vision-language cross-modal fusion architecture coupled with a learnable weight map estimation module for adaptive spatial modulation. Evaluated on multiple public benchmarks, our method achieves state-of-the-art performance, significantly improving color fidelity, regional consistency, and alignment with user-defined stylistic preferences.

📝 Abstract
Image retouching has received significant attention due to its ability to achieve high-quality visual content. Existing approaches mainly rely on uniform pixel-wise color mapping across entire images, neglecting the inherent color variations induced by image content. This limitation hinders existing approaches from achieving adaptive retouching that accommodates both diverse color distributions and user-defined style preferences. To address these challenges, we propose a novel Content-Adaptive image retouching method guided by Attribute-based Text Representation (CA-ATP). Specifically, we propose a content-adaptive curve mapping module, which leverages a series of basis curves to establish multiple color mapping relationships and learns the corresponding weight maps, enabling content-aware color adjustments. The proposed module can capture color diversity within the image content, allowing similar color values to receive distinct transformations based on their spatial context. In addition, we propose an attribute text prediction module that generates text representations from multiple image attributes, which explicitly represent user-defined style preferences. These attribute-based text representations are subsequently integrated with visual features via a multimodal model, providing user-friendly guidance for image retouching. Extensive experiments on several public datasets demonstrate that our method achieves state-of-the-art performance.
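The abstract's core mechanism — blending several basis tone curves with learned per-pixel weight maps so that similar colors can be transformed differently depending on spatial context — can be sketched as follows. This is a minimal illustration of the general multi-basis curve idea, not the paper's actual implementation; the function name, LUT-based curve representation, and weight normalization are assumptions for clarity.

```python
import numpy as np

def multi_basis_curve_mapping(image, curves, weight_maps, eps=1e-8):
    """Blend K basis tone curves per pixel (illustrative sketch).

    image: (H, W, 3) floats in [0, 1]
    curves: (K, L) lookup tables, each mapping [0, 1] -> [0, 1]
    weight_maps: (K, H, W) spatial weights (e.g. predicted by a network)
    """
    K, L = curves.shape
    # Quantize intensities to LUT indices.
    idx = np.clip((image * (L - 1)).astype(int), 0, L - 1)
    # Apply every basis curve to every pixel: shape (K, H, W, 3).
    mapped = curves[:, idx]
    # Normalize weights over the K basis curves, shared across channels.
    w = weight_maps / (weight_maps.sum(axis=0, keepdims=True) + eps)
    w = w[..., None]  # (K, H, W, 1)
    # Content-adaptive result: per-pixel weighted sum of curve outputs.
    return (w * mapped).sum(axis=0)
```

With spatially varying `weight_maps`, two pixels with identical input color can land on different output colors — the property the abstract attributes to the content-adaptive curve mapping module.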
Problem

Research questions and friction points this paper is trying to address.

Adaptive image retouching for diverse color distributions
Integrating user-defined style preferences via text guidance
Overcoming uniform color mapping limitations with content-aware adjustments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Content-adaptive curve mapping for color adjustments
Attribute-based text representation for style guidance
Multimodal integration of text and visual features
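The third bullet — integrating attribute-based text representations with visual features — is commonly realized with cross-attention, where visual tokens attend to text token embeddings. The sketch below shows that generic pattern under assumed shapes; the paper's actual fusion architecture, dimensions, and residual scheme are not specified here, so treat every name in this snippet as hypothetical.

```python
import numpy as np

def fuse_text_visual(visual_feats, text_embed):
    """Scaled dot-product cross-attention fusion (illustrative sketch).

    visual_feats: (N, D) flattened spatial visual tokens
    text_embed: (T, D) attribute text token embeddings
    Returns visual tokens augmented with text-conditioned context.
    """
    D = visual_feats.shape[1]
    # Attention scores of each visual token over the text tokens.
    scores = visual_feats @ text_embed.T / np.sqrt(D)   # (N, T)
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)             # softmax over text tokens
    context = attn @ text_embed                         # (N, D)
    # Residual fusion keeps the original visual signal intact.
    return visual_feats + context
```

In a trained model the query/key/value projections would be learned; they are omitted here to keep the fusion mechanism itself visible.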
Hancheng Zhu
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou 221116, China, and also with the Mine Digitization Engineering Research Center of the Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
Xinyu Liu
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou 221116, China, and also with the Mine Digitization Engineering Research Center of the Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
Rui Yao
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou 221116, China, and also with the Mine Digitization Engineering Research Center of the Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
Kunyang Sun
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou 221116, China, and also with the Mine Digitization Engineering Research Center of the Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
Leida Li
Xidian University, China
Visual quality evaluation, Computational aesthetics, Affective computing
Abdulmotaleb El Saddik
MCRLab, University of Ottawa
Immersive Media, Digital Twins, Human Centered AI, Multimedia Communication, Metaverse