🤖 AI Summary
Existing interpretability methods for vision models often rely on white-box access or lack quantitative rigor. This work proposes a model-agnostic interpretability framework that identifies the critical visual regions, termed visual focuses, underlying a model's predictions and translates them into compact logical expressions, yielding structured, transparent explanations of model decisions. By combining logical rules with visual focus analysis, the approach requires no internal model access and introduces novel evaluation metrics, including focus precision, recall, and divergence. Experiments show that as training progresses, model attention becomes more concentrated and focus accuracy rises with generalization; the method also reveals anomalous focusing behaviors under data bias and adversarial attacks.
📝 Abstract
Interpretability of modern visual models is crucial, particularly in high-stakes applications. However, existing interpretability methods typically suffer from either reliance on white-box model access or insufficient quantitative rigor. To address these limitations, we introduce FocaLogic, a novel model-agnostic framework designed to interpret and quantify visual model decision-making through logic-based representations. FocaLogic identifies minimal interpretable subsets of visual regions, termed visual focuses, that decisively influence model predictions. It translates these visual focuses into precise and compact logical expressions, enabling transparent and structured interpretations. Additionally, we propose a suite of quantitative metrics, including focus precision, recall, and divergence, to objectively evaluate model behavior across diverse scenarios. Empirical analyses demonstrate FocaLogic's capability to uncover critical insights such as training-induced concentration, increasing focus accuracy through generalization, and anomalous focuses under biases and adversarial attacks. Overall, FocaLogic provides a systematic, scalable, and quantitative solution for interpreting visual models.
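To make the metric names concrete, here is a minimal sketch of how focus-style precision, recall, and divergence could be computed over binary region masks and normalized attention maps. The paper's exact definitions are not reproduced here; the function names, the mask representation, and the use of KL divergence are illustrative assumptions.

```python
import math

def focus_precision(focus, ground_truth):
    """Fraction of predicted focus regions that fall inside the ground-truth
    region (assumed definition; inputs are flattened binary masks)."""
    hits = sum(1 for f, g in zip(focus, ground_truth) if f and g)
    total = sum(focus)
    return hits / total if total else 0.0

def focus_recall(focus, ground_truth):
    """Fraction of the ground-truth region recovered by the predicted focus
    (assumed definition)."""
    hits = sum(1 for f, g in zip(focus, ground_truth) if f and g)
    total = sum(ground_truth)
    return hits / total if total else 0.0

def focus_divergence(p, q, eps=1e-9):
    """KL divergence between two focus maps after normalizing each to a
    probability distribution (one plausible reading of 'focus divergence')."""
    ps, qs = sum(p), sum(q)
    return sum((pi / ps) * math.log((pi / ps + eps) / (qi / qs + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Example: a 6-cell flattened image, model focus vs. annotated object region.
focus = [1, 1, 0, 0, 1, 0]   # regions the model relied on
truth = [1, 1, 1, 0, 0, 0]   # annotated object region
print(focus_precision(focus, truth))  # 2 of 3 focus cells hit the object
print(focus_recall(focus, truth))     # 2 of 3 object cells were focused on
```

Under these assumptions, a model whose focus concentrates on the annotated object over training would show both metrics rising, while a biased or adversarially attacked model would show high divergence between its focus map and a clean reference map.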