FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback

📅 2024-04-07
🏛️ arXiv.org
📈 Citations: 16
Influential: 2
🤖 AI Summary
To address hallucinations—specifically object existence, attribute, and relational inconsistencies—arising from vision-language misalignment in large multimodal models, this paper proposes a fine-grained AI-feedback-driven reinforcement learning (RL) alignment framework. Methodologically, it introduces the first segment-level AI feedback mechanism capable of explicitly identifying hallucination types; constructs three dedicated reward models to generate dense, fine-grained rewards per hallucination category; and designs a plug-and-play feedback module to enhance PPO optimization. Compared to existing RL-based alignment approaches, the framework achieves superior performance on hallucination benchmarks and general multimodal understanding tasks, with fewer parameters and significantly reduced reliance on human annotations. Key contributions include: (1) hallucination-type-aware fine-grained AI feedback, (2) dense multi-type reward modeling, and (3) a lightweight, efficient PPO variant—establishing a scalable, low-overhead paradigm for cross-modal alignment.

📝 Abstract
Large Vision-Language Models (LVLMs) have demonstrated proficiency in tackling a variety of visual-language tasks. However, current LVLMs suffer from misalignment between text and image modalities, which causes three kinds of hallucination problems, i.e., object existence, object attribute, and object relationship. To tackle this issue, existing methods mainly utilize Reinforcement Learning (RL) to align modalities in LVLMs. However, they still suffer from three main limitations: (1) general feedback cannot indicate the hallucination type contained in the response; (2) sparse rewards only give a sequence-level reward for the whole response; and (3) annotation is time-consuming and labor-intensive. To handle these limitations, we propose an innovative method to align modalities in LVLMs through Fine-Grained Artificial Intelligence Feedback (FGAIF), which mainly consists of three steps: AI-based Feedback Collection, Fine-grained Reward Model Training, and Reinforcement Learning with Fine-grained Reward. Specifically, we first utilize AI tools to predict the types of hallucination for each segment in the response and obtain a collection of fine-grained feedback. Then, based on the collected reward data, three specialized reward models are trained to produce dense rewards. Finally, a novel fine-grained feedback module is integrated into the Proximal Policy Optimization (PPO) algorithm. Extensive experiments are conducted on hallucination and general benchmarks, demonstrating the superior performance of our proposed method. Notably, compared with previous models trained with the RL-based aligning method, our proposed method is effective even with fewer parameters.
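The first step of the pipeline described in the abstract—AI-based Feedback Collection—can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `query_ai_tool` is a hypothetical stand-in for whatever AI labeler the authors use, and the sentence-based segmentation is an assumption.

```python
# Hedged sketch of AI-based feedback collection: split a response into
# segments and label each one for the three hallucination types the paper
# targets. `query_ai_tool` is hypothetical; the paper's real prompts,
# tools, and segmentation scheme are not reproduced here.

HALLUCINATION_TYPES = ("object_existence", "object_attribute", "object_relationship")

def split_into_segments(response: str) -> list[str]:
    # Naive sentence split; the paper's actual segmentation may differ.
    return [s.strip() for s in response.split(".") if s.strip()]

def query_ai_tool(segment: str, hallucination_type: str) -> int:
    # Hypothetical labeler: returns 1 if the segment contains this
    # hallucination type, else 0. Replace with a real AI tool call.
    return 0

def collect_fine_grained_feedback(response: str) -> list[dict]:
    # One feedback record per segment, covering every hallucination type,
    # which later serves as training data for the three reward models.
    feedback = []
    for segment in split_into_segments(response):
        labels = {t: query_ai_tool(segment, t) for t in HALLUCINATION_TYPES}
        feedback.append({"segment": segment, "labels": labels})
    return feedback
```

The per-segment, per-type labels are what make the feedback "fine-grained": each record identifies both where a hallucination occurs and which kind it is, rather than scoring the response as a whole.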
Problem

Research questions and friction points this paper is trying to address.

Misalignment between text and image modalities in LVLMs
Hallucination problems in object existence, attribute, and relationship
Limitations in feedback granularity, reward sparsity, and annotation cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses AI tools for fine-grained feedback collection
Trains three specialized dense reward models
Integrates fine-grained feedback into PPO algorithm
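The last two bullets—dense reward models and their integration into PPO—can be sketched as below. The equal weighting across the three reward models and the spreading of each segment's reward over its token span are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch of turning three specialized reward-model scores into a
# dense, token-level reward vector for PPO. Weights and token assignment
# are illustrative assumptions.

def combined_segment_reward(segment_scores, weights=(1.0, 1.0, 1.0)):
    # segment_scores: per-segment scores from the existence, attribute,
    # and relationship reward models, each in [0, 1] (1 = no hallucination).
    return sum(w * s for w, s in zip(weights, segment_scores)) / sum(weights)

def dense_rewards(per_segment_scores, segment_token_spans, seq_len):
    # Spread each segment's combined reward over its token span, yielding
    # a per-token reward vector a PPO trainer can consume directly,
    # instead of a single sequence-level scalar.
    rewards = [0.0] * seq_len
    for scores, (start, end) in zip(per_segment_scores, segment_token_spans):
        r = combined_segment_reward(scores)
        for t in range(start, end):
            rewards[t] = r
    return rewards
```

The point of the dense formulation is credit assignment: a segment containing, say, an attribute hallucination is penalized locally, while correct segments in the same response still receive full reward.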