OARS: Process-Aware Online Alignment for Generative Real-World Image Super-Resolution

📅 2026-03-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a central challenge in generative real-world image super-resolution (Real-ISR): balancing perceptual quality against fidelity while aligning outputs with human visual preferences under unknown degradations. We propose OARS, a framework built on COMPASS, a multimodal large language model (MLLM)-based reward that applies an input-quality-adaptive strategy to the perception-fidelity trade-off. COMPASS is trained on labels from a three-stage fine-grained annotation pipeline, and OARS uses it within an online alignment training paradigm. Through progressive reinforcement learning, moving from full-reference to no-reference settings with shallow LoRA optimization, OARS achieves state-of-the-art performance on Real-ISR benchmarks. Experiments and user studies show consistent gains in perceptual quality while fidelity is maintained.

📝 Abstract

Aligning generative real-world image super-resolution models with human visual preference is challenging due to the perception-fidelity trade-off and diverse, unknown degradations. Prior approaches rely on offline preference optimization and static metric aggregation, which are often non-interpretable and prone to pseudo-diversity under strong conditioning. We propose OARS, a process-aware online alignment framework built on COMPASS, an MLLM-based reward that evaluates the LR-to-SR transition by jointly modeling fidelity preservation and perceptual gain with an input-quality-adaptive trade-off. To train COMPASS, we curate COMPASS-20K spanning synthetic and real degradations, and introduce a three-stage perceptual annotation pipeline that yields calibrated, fine-grained training labels. Guided by COMPASS, OARS performs progressive online alignment from cold-start flow matching to full-reference and finally reference-free RL via shallow LoRA optimization for on-policy exploration. Extensive experiments and user studies demonstrate consistent perceptual improvements while maintaining fidelity, achieving state-of-the-art performance on Real-ISR benchmarks.
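
To make the idea of an input-quality-adaptive trade-off concrete, the toy sketch below blends a fidelity score and a perceptual-gain score with a weight that depends on the quality of the LR input. It is not the COMPASS reward: the abstract gives no formula, so the function name `adaptive_reward`, the linear blend, the [0, 1] score ranges, and the choice to weight fidelity more heavily for cleaner inputs are all assumptions made for illustration.

```python
def adaptive_reward(fidelity: float, perceptual_gain: float, input_quality: float) -> float:
    """Toy process-aware reward; every score is assumed to lie in [0, 1].

    fidelity        -- how well the SR output preserves the LR content
    perceptual_gain -- how much the SR output improves perceived quality
    input_quality   -- estimated quality of the LR input
    """
    scores = {
        "fidelity": fidelity,
        "perceptual_gain": perceptual_gain,
        "input_quality": input_quality,
    }
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    # Assumption: a cleaner input carries more trustworthy content, so
    # fidelity is weighted more; a heavily degraded input shifts the weight
    # toward perceptual gain. The real trade-off may take a different form.
    fidelity_weight = input_quality
    return fidelity_weight * fidelity + (1.0 - fidelity_weight) * perceptual_gain


# Same SR scores, different input quality: the reward emphasis shifts.
clean_input_reward = adaptive_reward(fidelity=0.9, perceptual_gain=0.2, input_quality=1.0)
degraded_input_reward = adaptive_reward(fidelity=0.9, perceptual_gain=0.2, input_quality=0.0)
```

Under these assumptions the same output is rewarded mainly for fidelity when the input is clean and mainly for perceptual gain when the input is heavily degraded, which is the kind of behavior a static weighted sum of metrics cannot express.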

Problem

Research questions and friction points this paper is trying to address.

image super-resolution

human visual preference

perception-fidelity trade-off

real-world degradations

generative modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

process-aware online alignment

generative image super-resolution

perception-fidelity trade-off