Goal Conditioned Reinforcement Learning for Photo Finishing Tuning

📅 2025-03-10

🏛️ Neural Information Processing Systems

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Manual parameter tuning in photo finishing pipelines (e.g., Adobe Lightroom) is slow, and the pipelines themselves are black-box and non-differentiable. To address this, the paper proposes a goal-conditioned reinforcement learning framework that requires no surrogate (proxy) model. The policy is conditioned on a goal image, such as a reference style exemplar or a pixel-aligned target, and is optimized with Proximal Policy Optimization (PPO) to issue iterative parameter updates directly against the native pipeline. Whereas optimization-based approaches typically need about 200 queries, the trained policy finds suitable parameters within about 10, a roughly 20× reduction in queries. Experiments on photo finishing tuning and photo stylization tuning demonstrate the advantages of the method, and it is reported to generalize to unseen styles and device-specific processing pipelines. Project website: https://openimaginglab.github.io/RLPixTuner/.

📝 Abstract

Photo finishing tuning aims to automate the manual tuning of a photo finishing pipeline, such as Adobe Lightroom or Darktable. Previous works either use zeroth-order optimization, which becomes slow as the number of parameters grows, or rely on a differentiable proxy of the target finishing pipeline, which is hard to train. To overcome these challenges, we propose a novel goal-conditioned reinforcement learning framework that efficiently tunes parameters using a goal image as a condition. Unlike previous approaches, our tuning framework does not rely on any proxy and treats the photo finishing pipeline as a black box. Using a trained reinforcement learning policy, it can find the desired set of parameters within just 10 queries, while optimization-based approaches normally take 200 queries. Furthermore, our architecture uses a goal image to guide the iterative tuning of pipeline parameters, allowing for flexible conditioning on pixel-aligned target images, style images, or any other visually representable goals. We conduct detailed experiments on photo finishing tuning and photo stylization tuning tasks, demonstrating the advantages of our method. Project website: https://openimaginglab.github.io/RLPixTuner/.
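
The iterative, goal-conditioned tuning loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `toy_pipeline` (a gain-and-offset adjustment), `QueryCounter`, `heuristic_policy`, and `tune` are all hypothetical names, and the hand-written rule in `heuristic_policy` merely stands in for a trained policy network. The point is the interface: the policy sees the current render, the goal image, and the current parameters, and every render of the black-box pipeline counts as one query.

```python
import numpy as np


def toy_pipeline(raw, params):
    """Hypothetical stand-in for a finishing pipeline: gain and offset, clipped to [0, 1]."""
    gain, offset = params
    return np.clip(gain * raw + offset, 0.0, 1.0)


class QueryCounter:
    """Wraps a pipeline so that each render counts as one black-box query."""

    def __init__(self, pipeline):
        self.pipeline = pipeline
        self.calls = 0

    def __call__(self, raw, params):
        self.calls += 1
        return self.pipeline(raw, params)


def heuristic_policy(current, goal, params):
    """Placeholder for a trained goal-conditioned policy (assumption, not learned).

    Takes the current render, the goal image, and the current parameters,
    and returns a parameter update.
    """
    gain, _ = params
    # Move gain halfway toward matching the goal's contrast (standard deviation).
    d_gain = 0.5 * gain * (goal.std() / max(current.std(), 1e-8) - 1.0)
    # Move offset halfway toward matching the goal's mean brightness.
    d_offset = 0.5 * (goal.mean() - current.mean())
    return np.array([d_gain, d_offset])


def tune(raw, goal, policy, pipeline, init_params, budget=10):
    """Run `budget` query/update rounds and return the tuned parameters."""
    params = np.asarray(init_params, dtype=float)
    for _ in range(budget):
        current = pipeline(raw, params)                   # one black-box query
        params = params + policy(current, goal, params)   # goal-conditioned update
    return params


# Toy setup: the goal image is the same input rendered with hidden target parameters.
rng = np.random.default_rng(0)
raw = rng.uniform(0.2, 0.6, size=(8, 8, 3))
goal = toy_pipeline(raw, (1.4, -0.1))

box = QueryCounter(toy_pipeline)
tuned = tune(raw, goal, heuristic_policy, box, init_params=(1.0, 0.0), budget=10)

initial_err = np.mean((toy_pipeline(raw, (1.0, 0.0)) - goal) ** 2)
final_err = np.mean((toy_pipeline(raw, tuned) - goal) ** 2)
print(f"queries={box.calls}, initial MSE={initial_err:.2e}, final MSE={final_err:.2e}")
```

In this toy setting the goal is pixel-aligned with the input, but nothing in `tune` depends on that: the policy only ever compares images, so a style image could in principle be passed as `goal` if the policy were trained for it.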

Problem

Research questions and friction points this paper is trying to address.

Automates manual photo finishing pipeline tuning

Overcomes slow optimization and proxy training issues

Uses goal-conditioned RL for efficient parameter tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Goal-conditioned reinforcement learning for photo tuning

Treats the photo finishing pipeline as a black box; no differentiable proxy needed

Finds parameters within about 10 queries, versus roughly 200 for optimization-based approaches

🔎 Similar Papers

Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data