🤖 AI Summary
This work addresses the challenges of anomaly detection and relational modelling in compositional visual relations (CVR), where complex rules and scarce samples hinder effective learning. To tackle these issues, the authors propose a prediction-verification framework that predicts the features of a fourth image from the three input images and incorporates Predictive Anomaly Reasoning Blocks (PARBs) for iterative inference. The approach further integrates anomaly-aware contrastive learning to extract discriminative features. Notably, this is the first study to combine a prediction-verification mechanism with contrastive learning for CVR tasks, improving both model generalization and interpretability. Extensive experiments on the SVRT, CVR, and MC$^2$R datasets demonstrate substantial performance gains over current state-of-the-art methods, underscoring the framework's effectiveness in complex visual reasoning scenarios.
📝 Abstract
While visual reasoning for simple analogies has received significant attention, compositional visual relations (CVR) remain relatively unexplored due to their greater complexity. To solve CVR tasks, i.e., to identify an outlier image given three other images that follow the same compositional rules, we propose Predictive Reasoning with Augmented Anomaly Contrastive Learning (PR-A$^2$CL). To address the challenge of modelling abundant compositional rules, an Augmented Anomaly Contrastive Learning scheme is designed to distil discriminative and generalizable features by maximizing similarity among normal instances while minimizing similarity between normal instances and anomalous outliers. More importantly, a predict-and-verify paradigm is introduced for rule-based reasoning, in which a series of Predictive Anomaly Reasoning Blocks (PARBs) iteratively leverage features from three of the four images to predict those of the remaining one. During the subsequent verification stage, the PARBs progressively pinpoint the specific discrepancies attributable to the underlying rules. Experimental results on the SVRT, CVR and MC$^2$R datasets show that PR-A$^2$CL significantly outperforms state-of-the-art reasoning models.
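The predict-and-verify idea and the anomaly-aware contrastive objective described above can be illustrated with a minimal numerical sketch. Note this is an illustrative toy, not the paper's method: the mean-of-context predictor stands in for the learned PARBs (which iterate and refine), and the simple cosine-based loss stands in for the actual Augmented Anomaly Contrastive Learning objective; all function names here are invented for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def predict(context):
    """Stand-in predictor: average the three context features.
    In the paper this role is played by learned, iterated PARBs."""
    return np.mean(context, axis=0)

def find_outlier(features):
    """Predict-and-verify: for each candidate image, predict its feature
    from the other three and score the discrepancy; the image whose
    feature is predicted worst is declared the outlier."""
    errors = []
    for i in range(len(features)):
        context = [f for j, f in enumerate(features) if j != i]
        errors.append(1.0 - cosine(predict(context), features[i]))
    return int(np.argmax(errors)), errors

def anomaly_contrastive_loss(normals, outlier):
    """Toy anomaly-aware contrastive objective: pull rule-following
    features together (high pairwise similarity) while pushing the
    outlier away (low normal-outlier similarity)."""
    pos = np.mean([cosine(a, b)
                   for i, a in enumerate(normals) for b in normals[i + 1:]])
    neg = np.mean([cosine(a, outlier) for a in normals])
    return neg - pos  # minimized when normals cluster and the outlier is far
```

With three nearly collinear "normal" feature vectors and one orthogonal outlier, `find_outlier` flags the orthogonal vector, and the toy loss is strongly negative for a well-separated outlier.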