MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models

📅 2025-12-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address limited multi-rationale semantic modeling, insufficient logical robustness, and susceptibility to misleading interpretations when multimodal large language models (MLLMs) reason in complex scenarios, this paper proposes the MIND reasoning framework, which is designed to give MLLMs a three-step cognitive process: "Understand -> Rethink -> Correct." It has three components: (1) Rationale Augmentation and Discrimination (RAD), which expands existing datasets by generating diverse rationales; (2) Progressive Two-stage Correction Learning (P2CL), which first strengthens multi-rationale positive learning and then trains active logic discrimination and correction; and (3) Multi-rationale Contrastive Alignment (MCA), which aggregates the representations of correct reasoning and separates them from those of incorrect reasoning. The authors report state-of-the-art performance on multiple public datasets covering scientific, commonsense, and mathematical scenarios.

📝 Abstract

Recently, multimodal large language models (MLLMs) have been widely applied to reasoning tasks. However, they suffer from limited multi-rationale semantic modeling and insufficient logical robustness, and they are susceptible to misleading interpretations in complex scenarios. We therefore propose a Multi-rationale INtegrated Discriminative (MIND) reasoning framework, which is designed to endow MLLMs with the human-like cognitive abilities of "Understand -> Rethink -> Correct" and shifts the paradigm from passive imitation-based reasoning to active discriminative reasoning. Specifically, we introduce a Rationale Augmentation and Discrimination (RAD) paradigm, which automatically and efficiently expands existing datasets by generating diverse rationales, providing a unified and extensible data foundation. We also design a Progressive Two-stage Correction Learning (P2CL) strategy: the first phase enhances multi-rationale positive learning, while the second phase enables active logic discrimination and correction. In addition, to mitigate representation entanglement in the multi-rationale semantic space, we propose a Multi-rationale Contrastive Alignment (MCA) optimization strategy, which achieves semantic aggregation of correct reasoning and boundary separation of incorrect reasoning. Extensive experiments demonstrate that the proposed MIND reasoning framework achieves state-of-the-art (SOTA) performance on multiple public datasets covering scientific, commonsense, and mathematical scenarios. It provides a new perspective for advancing MLLMs towards higher levels of cognitive intelligence. Our code is available at https://github.com/YuChuang1205/MIND
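The abstract describes MCA only at a high level: pull representations of correct reasoning together and push incorrect reasoning away. As a rough illustration of that idea, and not the authors' loss or code from their repository, the sketch below uses a generic supervised-contrastive (InfoNCE-style) objective in NumPy. The function name `multi_rationale_contrastive_loss`, the use of a single anchor embedding, the cosine similarity, and the `temperature` value are all assumptions made for this example.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def multi_rationale_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Hypothetical contrastive loss, NOT the paper's MCA formulation.

    anchor:    (d,)   embedding of the question or reference reasoning
    positives: (P, d) embeddings of correct rationales
    negatives: (N, d) embeddings of incorrect rationales
    """
    a = l2_normalize(np.asarray(anchor, dtype=float))
    pos = l2_normalize(np.asarray(positives, dtype=float))
    neg = l2_normalize(np.asarray(negatives, dtype=float))

    # Cosine similarities scaled by the temperature.
    pos_logits = pos @ a / temperature
    neg_logits = neg @ a / temperature
    all_logits = np.concatenate([pos_logits, neg_logits])

    # Numerically stable log-sum-exp over every candidate rationale.
    m = all_logits.max()
    log_denom = m + np.log(np.exp(all_logits - m).sum())

    # Average of -log softmax probability over the correct rationales.
    return float(np.mean(log_denom - pos_logits))


anchor = [1.0, 0.0]
correct = [[1.0, 0.0], [0.9, 0.1]]
incorrect = [[-1.0, 0.0], [0.0, 1.0]]
aligned = multi_rationale_contrastive_loss(anchor, correct, incorrect)
swapped = multi_rationale_contrastive_loss(anchor, incorrect, correct)
```

In this toy setup the loss is lower when the correct rationales lie close to the anchor (`aligned`) than when the roles are swapped (`swapped`), which is the aggregation-and-separation behaviour the abstract attributes to MCA.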

Problem

Research questions and friction points this paper is trying to address.

MLLMs have limited multi-rationale semantic modeling

MLLMs lack logical robustness and are susceptible to misleading interpretations

Existing reasoning is passive and imitation-based rather than actively discriminative

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates diverse rationales to augment datasets automatically

Uses two-stage learning: multi-rationale positive learning first, then active logic discrimination and correction

Aligns correct reasoning and separates incorrect reasoning semantically
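The page does not say how the augmented rationales are turned into training examples for the two stages, so the following is only a minimal sketch under stated assumptions. The `Rationale` and `Sample` classes, the prompt and target strings, and the helper names `build_stage1` and `build_stage2` are hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rationale:
    text: str
    is_correct: bool  # assumed label produced during rationale augmentation


@dataclass
class Sample:
    question: str
    answer: str
    rationales: List[Rationale]


def build_stage1(sample: Sample) -> List[Dict[str, str]]:
    """Stage 1 (assumed form): learn from every correct rationale."""
    return [
        {"prompt": sample.question,
         "target": f"{r.text} Answer: {sample.answer}"}
        for r in sample.rationales
        if r.is_correct
    ]


def build_stage2(sample: Sample) -> List[Dict[str, str]]:
    """Stage 2 (assumed form): judge a candidate rationale, then correct it."""
    correct = [r for r in sample.rationales if r.is_correct]
    examples = []
    for r in sample.rationales:
        verdict = "correct" if r.is_correct else "incorrect"
        target = f"Verdict: {verdict}."
        if not r.is_correct and correct:
            # Pair each wrong rationale with a correct one as the revision.
            target += f" Corrected rationale: {correct[0].text}"
        target += f" Answer: {sample.answer}"
        examples.append({
            "prompt": f"{sample.question}\nCandidate rationale: {r.text}",
            "target": target,
        })
    return examples


sample = Sample(
    question="Which object floats in water: a cork or a steel bolt?",
    answer="cork",
    rationales=[
        Rationale("Cork is less dense than water, so it floats.", True),
        Rationale("Cork traps air and has low density.", True),
        Rationale("Steel is lighter than water.", False),
    ],
)
stage1 = build_stage1(sample)
stage2 = build_stage2(sample)
```

Here `stage1` holds one example per correct rationale, while `stage2` holds one example per candidate rationale, with the incorrect candidate paired with a verdict and a corrected rationale.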

🔎 Similar Papers

Semantic Self-Consistency: Enhancing Language Model Reasoning via Semantic Weighting