Automated Detection of Visual Attribute Reliance with a Self-Reflective Agent

📅 2025-10-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses latent visual attribute dependencies, such as texture, color, or shape, in vision model predictions, which undermine robustness and interpretability. The authors propose a self-reflective agent framework that automatically identifies these implicit dependencies via a closed loop of hypothesis generation, controlled experimental validation, and iterative self-evaluation. The agent designs interventions to isolate the causal effect of each visual attribute and adapts its detection strategy based on experimental outcomes. Evaluated on a benchmark of 130 vision models spanning 18 attribute categories, the method significantly outperforms non-reflective baselines and uncovers real-world dependency patterns in state-of-the-art models, including CLIP's vision encoder and the YOLOv8 object detector. The core contribution is the first systematic integration of self-reflection into visual dependency diagnosis, enabling adaptive, robust, and interpretable attribution analysis.
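The closed loop the summary describes (hypothesize, intervene, self-evaluate, reflect) can be sketched as a generic driver. This is a minimal illustration under assumed interfaces, not the authors' implementation; `propose`, `intervene`, and `evaluate` are hypothetical callables supplied by the caller:

```python
from typing import Callable, Dict, List, Tuple

def reflective_detection_loop(
    propose: Callable[[List[Dict]], List[str]],          # history -> attribute hypotheses
    intervene: Callable[[str], float],                   # hypothesis -> measured causal effect
    evaluate: Callable[[List[Dict]], Tuple[bool, str]],  # findings -> (consistent?, critique)
    max_cycles: int = 5,
) -> List[Dict]:
    """Generic driver for the hypothesize-test-reflect cycle (illustrative only)."""
    findings: List[Dict] = []
    for cycle in range(max_cycles):
        # 1. Generate candidate attribute hypotheses, conditioned on prior findings.
        for hypothesis in propose(findings):
            # 2. Run a controlled intervention that perturbs only that attribute.
            effect = intervene(hypothesis)
            findings.append({"cycle": cycle, "attribute": hypothesis, "effect": effect})
        # 3. Self-evaluate: do the measured effects explain model behavior?
        consistent, critique = evaluate(findings)
        if consistent:
            break
        # 4. Reflect: record the critique so the next cycle's proposals can use it.
        findings.append({"cycle": cycle, "critique": critique})
    return findings
```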

📝 Abstract
When a vision model performs image recognition, which visual attributes drive its predictions? Detecting unintended reliance on specific visual features is critical for ensuring model robustness, preventing overfitting, and avoiding spurious correlations. We introduce an automated framework for detecting such dependencies in trained vision models. At the core of our method is a self-reflective agent that systematically generates and tests hypotheses about visual attributes that a model may rely on. This process is iterative: the agent refines its hypotheses based on experimental outcomes and uses a self-evaluation protocol to assess whether its findings accurately explain model behavior. When inconsistencies arise, the agent self-reflects over its findings and triggers a new cycle of experimentation. We evaluate our approach on a novel benchmark of 130 models designed to exhibit diverse visual attribute dependencies across 18 categories. Our results show that the agent's performance consistently improves with self-reflection, with a significant performance increase over non-reflective baselines. We further demonstrate that the agent identifies real-world visual attribute dependencies in state-of-the-art models, including CLIP's vision encoder and the YOLOv8 object detector.
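To make the "experimental outcomes" step concrete, one possible controlled intervention (not the paper's actual protocol) is a grayscale transform that removes color while leaving shape and texture intact; `predict`, `images`, and `labels` below are hypothetical placeholders:

```python
import numpy as np

def color_reliance_score(predict, images, labels):
    """Estimate reliance on color as the accuracy drop under a grayscale intervention.

    predict: callable mapping a batch of (N, H, W, 3) float arrays to predicted labels.
    images:  np.ndarray of shape (N, H, W, 3); labels: np.ndarray of shape (N,).
    """
    base_acc = np.mean(predict(images) == labels)

    # Grayscale intervention: replace each pixel's RGB with its luminance
    # (ITU-R BT.601 weights), keeping all non-color attributes fixed.
    lum = images @ np.array([0.299, 0.587, 0.114])
    gray = np.repeat(lum[..., None], 3, axis=-1)
    gray_acc = np.mean(predict(gray) == labels)

    # A large drop suggests the model's predictions depend on color.
    return base_acc - gray_acc
```

Analogous interventions (texture scrambling, shape silhouettes) would probe the other attribute categories the benchmark covers.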
Problem

Research questions and friction points this paper is trying to address.

Detecting unintended reliance on specific visual features in trained vision models
Automating the generation and testing of hypotheses about model attribute dependencies
Improving detection accuracy through iterative self-reflection cycles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-reflective agent generates and tests attribute hypotheses
Iterative process refines hypotheses through self-evaluation
Automated framework detects visual dependencies in trained models
👥 Authors
Christy Li
MIT CSAIL
Josep Lopez Camunas
Universitat Oberta de Catalunya
Jake Thomas Touchet
Louisiana Tech
Jacob Andreas
MIT
Agata Lapedriza
Universitat Oberta de Catalunya, Northeastern University
Antonio Torralba
Professor of Computer Science, MIT
Tamar Rott Shaham
Postdoctoral fellow, MIT