Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification

📅 2024-10-06
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Vision classification models generalize poorly and lack interpretability under unknown-class and unknown-domain settings. To address this, we propose logic-regularized learning (L-Reg), the first framework to establish a theoretical connection between formal logic and visual generalization. It incorporates semantic part-based logical rules (e.g., “face → person”) into the training objective, jointly constraining feature distribution complexity and classifier weight norms to achieve interpretable and robust discrimination. The method supports both multi-domain joint training and generalized category discovery. Theoretical analysis shows that L-Reg reduces model complexity. Empirically, L-Reg achieves significant accuracy gains on multi-domain generalization and generalized category discovery benchmarks, and it consistently improves generalization under unknown-class and unknown-domain conditions while remaining intrinsically interpretable.

📝 Abstract
Vision models excel in image classification but struggle to generalize to unseen data, such as classifying images from unseen domains or discovering novel categories. In this paper, we explore the relationship between logical reasoning and deep learning generalization in visual classification. A logical regularization termed L-Reg is derived which bridges a logical analysis framework to image classification. Our work reveals that L-Reg reduces the complexity of the model in terms of the feature distribution and classifier weights. Specifically, we unveil the interpretability brought by L-Reg, as it enables the model to extract the salient features, such as faces to persons, for classification. Theoretical analysis and experiments demonstrate that L-Reg enhances generalization across various scenarios, including multi-domain generalization and generalized category discovery. In complex real-world scenarios where images span unknown classes and unseen domains, L-Reg consistently improves generalization, highlighting its practical efficacy.
Problem

Research questions and friction points this paper is trying to address.

Visual Model
Adaptability
Unseen Categories
Innovation

Methods, ideas, or system contributions that make the work stand out.

L-Reg
Logic Reasoning
Domain Adaptation
Zhaorui Tan
University of Liverpool, PhD student
Generalization, Text-to-Image, Generative models
Xi Yang
Xi’an Jiaotong-Liverpool University
Qiufeng Wang
Xi’an Jiaotong-Liverpool University
Anh Nguyen
University of Liverpool
Kaizhu Huang
Professor, Duke Kunshan University
Generalization & Robustness, Statistical Learning Theory, Trustworthy AI