🤖 AI Summary
This work addresses the tendency of vision-language models to over-rely on textual priors while neglecting visual evidence, which often leads to unfaithful reasoning and hallucinated outputs. To mitigate this issue, the authors propose a saliency map alignment reward mechanism that incurs no additional computational overhead, marking the first integration of visual saliency with reinforcement learning in this domain. By tracking the flow of visual information during reasoning and using the overlap between generated saliency maps and human-annotated regions as a reward signal, the method steers the model to attend to semantically relevant image areas. Built upon saliency map generation, Group Relative Policy Optimization (GRPO), and a vision-language alignment reward formulation, the approach substantially enhances model faithfulness, interpretability, and performance on downstream tasks.
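The core reward signal described above, overlap between a generated saliency map and a human-annotated region, can be sketched as an IoU-style score. This is a minimal illustration, not the paper's exact formulation; the function name, binarization threshold, and bounding-box convention are assumptions:

```python
import numpy as np

def saliency_alignment_reward(saliency_map, bbox, threshold=0.5):
    """Hypothetical reward: IoU between a thresholded saliency map
    and a human-annotated bounding box (x1, y1, x2, y2)."""
    h, w = saliency_map.shape
    # Binarize the saliency map at a fixed threshold (assumption).
    salient = saliency_map >= threshold
    # Rasterize the annotated box into a boolean mask.
    mask = np.zeros((h, w), dtype=bool)
    x1, y1, x2, y2 = bbox
    mask[y1:y2, x1:x2] = True
    # Intersection-over-union between salient pixels and the box.
    inter = np.logical_and(salient, mask).sum()
    union = np.logical_or(salient, mask).sum()
    return float(inter / union) if union > 0 else 0.0
```

A perfect match yields a reward of 1.0 and a disjoint saliency map yields 0.0, giving the policy a dense scalar signal to maximize.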
📝 Abstract
Vision-language models (VLMs) have achieved remarkable success across diverse tasks. However, concerns about their trustworthiness persist, particularly their tendency to rely more on textual cues than on visual evidence and the risk of producing ungrounded or fabricated responses. To address these issues, we propose Saliency-R1, a framework for improving the interpretability and faithfulness of VLM reasoning. Specifically, we introduce a novel saliency map technique that efficiently highlights the critical image regions contributing to each generated token without additional computational overhead. The technique further extends to tracing how visual information flows through the reasoning process to the final answer, revealing the alignment between the thinking process and the visual context. We use the overlap between the saliency maps and human-annotated bounding boxes as the reward function, and apply Group Relative Policy Optimization (GRPO) to align the salient parts with the critical regions, encouraging models to focus on relevant areas when conducting reasoning. Experiments show Saliency-R1 improves reasoning faithfulness, interpretability, and overall task performance.
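GRPO optimizes the policy using advantages computed relative to a group of sampled responses rather than a learned value function. A minimal sketch of the standard group-relative normalization step (variable names and the epsilon constant are ours; the paper's exact objective is not reproduced here):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: normalize each sampled response's
    reward by the mean and standard deviation of its group."""
    r = np.asarray(rewards, dtype=float)
    # Responses scoring above the group mean get positive advantages,
    # those below get negative ones.
    return (r - r.mean()) / (r.std() + eps)
```

For example, a group of saliency-alignment rewards such as `[0.9, 0.1, 0.5, 0.5]` produces positive advantage for the first response and negative for the second, steering the policy toward responses whose saliency better matches the annotated regions.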