AI Summary
Existing approaches to abstract visual reasoning struggle to model global structural patterns and local inter-row relationships at the same time, and they often place no effective constraints on intermediate features, which leads to entangled rule representations and limited generalization. This work proposes a dual-path reasoning mechanism that integrates row-wise analogical local reasoning with holistic global structural reasoning, fused dynamically through gated attention. It also introduces a rule-based contrastive learning framework that uses pseudo-labels to construct positive and negative sample pairs, strengthening rule abstraction and representation disentanglement. By jointly optimizing the local and global reasoning pathways (described as the first method to do so), the proposed approach significantly improves reasoning robustness and cross-domain generalization on three RAVEN benchmarks.
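To make the gated fusion of the two pathways concrete, the following is a minimal sketch in plain Python, not the DIRCR implementation. It assumes each pathway yields a feature vector of the same length and that a learned gate (here the hypothetical parameters `gate_weights` and `gate_bias`) decides, per dimension, how much of the local feature versus the global feature to keep. The function name `gated_fusion` and the per-dimension sigmoid gate are illustrative assumptions; the paper's gated attention may be structured differently.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    local_feat: List[float],
    global_feat: List[float],
    gate_weights: List[List[float]],
    gate_bias: List[float],
) -> List[float]:
    """Fuse local and global features with a per-dimension gate.

    Hypothetical formulation (an assumption, not taken from the paper):
        gate  = sigmoid(W @ [local; global] + b)
        fused = gate * local + (1 - gate) * global
    """
    if len(local_feat) != len(global_feat):
        raise ValueError("local and global features must have the same length")

    # The gate sees both pathways, so it can weigh one against the other.
    concat = local_feat + global_feat
    fused = []
    for i, (loc, glo) in enumerate(zip(local_feat, global_feat)):
        logit = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        gate = sigmoid(logit)
        fused.append(gate * loc + (1.0 - gate) * glo)
    return fused


if __name__ == "__main__":
    local = [1.0, 2.0]
    global_ = [3.0, 6.0]
    zero_w = [[0.0] * 4 for _ in range(2)]
    # With zero weights and zero bias the gate is 0.5 everywhere,
    # so the fused vector is the element-wise average of the two paths.
    print(gated_fusion(local, global_, zero_w, [0.0, 0.0]))
```

With an untrained (all-zero) gate the two pathways are simply averaged; training would move the gate toward whichever pathway is more informative for a given feature dimension.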
Abstract
Abstract visual reasoning remains challenging: existing methods often prioritize either global context or local row-wise relations without integrating the two, and they lack constraints on intermediate features, which leads to incomplete rule capture and entangled representations. To address these issues, we propose the Dual-Inference Rule-Contrastive Reasoning (DIRCR) model. Its core component, the Dual-Inference Reasoning Module, combines a local path for row-wise analogical reasoning with a global path for holistic inference, integrated via a gated attention mechanism. In addition, a Rule-Contrastive Learning Module introduces pseudo-labels to construct positive and negative rule samples and applies contrastive learning to enhance feature separability and promote abstract, transferable rule learning. Experimental results on three RAVEN datasets demonstrate that DIRCR significantly enhances reasoning robustness and generalization. Code is available at https://github.com/csZack-Zhang/DIRCR.
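The idea of building positive and negative rule samples from pseudo-labels can be illustrated with a small sketch. The code below uses a supervised-contrastive-style loss as a stand-in; the abstract does not specify the exact loss, so the function name `rule_contrastive_loss`, the cosine-similarity scoring, and the `temperature` default are all assumptions made for illustration. Features that share a pseudo-label are treated as positives, and all other features in the batch act as negatives.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def rule_contrastive_loss(
    features: Sequence[Sequence[float]],
    pseudo_labels: Sequence[int],
    temperature: float = 0.1,
) -> float:
    """Supervised-contrastive-style loss over pseudo-labelled rule features.

    Illustrative stand-in, not the paper's loss: for each anchor, features
    with the same pseudo-label are positives and the rest are negatives.
    """
    feats = [l2_normalize(f) for f in features]
    n = len(feats)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [
            j for j in range(n)
            if j != i and pseudo_labels[j] == pseudo_labels[i]
        ]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
            for j in range(n) if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    # Pseudo-labels that agree with the feature clusters give a lower loss
    # than pseudo-labels that cut across them.
    print(rule_contrastive_loss(feats, [0, 0, 1, 1]))
    print(rule_contrastive_loss(feats, [0, 1, 0, 1]))
```

Minimizing a loss of this kind pulls features with the same pseudo-label together and pushes the others apart, which is one way to obtain the feature separability the abstract describes.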