AI Summary
Pretrained perception models suffer performance degradation under distribution shifts, and existing metacognitive approaches that rely on logical rules for error filtering face an inherent precision-recall trade-off. This paper proposes a consistency-driven abductive reasoning framework that, for the first time, formalizes multi-model error management as a constrained abductive inference problem. We introduce an adjustable inconsistency-rate threshold to dynamically select prediction subsets with high coverage and low logical conflict, overcoming the recall bottleneck of conventional logic-based filtering. Our method integrates logic programming, integer programming, and heuristic search to enable inconsistency detection and multi-model prediction fusion under domain-specific constraints. On a challenging aerial imagery benchmark comprising 15 distribution-shift scenarios, our approach achieves +13.6% F1-score and +16.6% accuracy over the best single model, significantly outperforming state-of-the-art ensemble baselines.
Abstract
The deployment of pre-trained perception models in novel environments often leads to performance degradation due to distributional shifts. Although recent artificial intelligence approaches for metacognition use logical rules to characterize and filter model errors, improving precision often comes at the cost of reduced recall. This paper tests the hypothesis that leveraging multiple pre-trained models can mitigate this recall reduction. We formulate the challenge of identifying and managing conflicting predictions from various models as a consistency-based abduction problem. The input predictions and the learned error-detection rules derived from each model are encoded in a logic program. We then seek an abductive explanation, a subset of model predictions, that maximizes prediction coverage while ensuring that the rate of logical inconsistencies (derived from domain constraints) remains below a specified threshold. We propose two algorithms for this knowledge representation task: an exact method based on Integer Programming (IP) and an efficient Heuristic Search (HS). Through extensive experiments on a simulated aerial imagery dataset featuring controlled, complex distributional shifts, we demonstrate that our abduction-based framework outperforms individual models and standard ensemble baselines, achieving, for instance, average relative improvements of approximately 13.6% in F1-score and 16.6% in accuracy across 15 diverse test datasets when compared to the best individual model. Our results validate consistency-based abduction as an effective mechanism for robustly integrating knowledge from multiple imperfect reasoners in challenging, novel scenarios.
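To make the selection problem concrete, the heuristic-search variant can be illustrated with a minimal sketch: greedily grow a subset of predictions while the fraction of violated domain constraints stays at or below the threshold. The function name, the confidence-ordered greedy strategy, and the pairwise-conflict encoding of domain constraints below are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch (not the paper's code): greedy heuristic search that
# grows a set of predictions while the inconsistency rate stays under a cap.

def heuristic_select(predictions, conflicts, max_inconsistency_rate):
    """Greedily add predictions in descending confidence order, keeping the
    fraction of violated domain constraints at or below the threshold."""
    selected = []
    for pred in sorted(predictions, key=lambda p: -p["confidence"]):
        candidate = selected + [pred]
        ids = {p["id"] for p in candidate}
        # A conflict (a, b) is violated only if both predictions are selected.
        violated = sum(1 for a, b in conflicts if a in ids and b in ids)
        if violated / len(candidate) <= max_inconsistency_rate:
            selected = candidate
    return selected

# Hypothetical predictions from two models over the same image region.
preds = [
    {"id": "m1_car", "confidence": 0.9},
    {"id": "m2_truck", "confidence": 0.8},
    {"id": "m1_road", "confidence": 0.7},
]
# Hypothetical domain constraint: one region cannot be both a car and a truck.
conflicts = [("m1_car", "m2_truck")]

chosen = heuristic_select(preds, conflicts, max_inconsistency_rate=0.0)
```

With a threshold of zero, the lower-confidence conflicting prediction is dropped and the non-conflicting ones are kept; raising the threshold trades logical consistency for coverage, which is the recall lever the paper describes. The exact IP method would instead encode the same coverage objective and inconsistency-rate constraint as an integer program and solve it to optimality.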