Model-Free Counterfactual Subset Selection at Scale

📅 2025-02-12
🤖 AI Summary
To address synthetic-data bias, full-dataset dependency, and real-time bottlenecks in counterfactual explanation for streaming data, this paper proposes the first model-agnostic, synthetic-sample-free framework for large-scale streaming counterfactual subset selection. The method combines online submodular optimization with dynamic core-set maintenance under a joint diversity and relevance metric, and it requires no gradients, generative models, or distributional assumptions. Updates run in real time at O(log k) time per arriving item, without persistent storage of the full dataset. Experiments on multiple real-world and synthetic streaming datasets report a 32% improvement in explanation quality (F1-score) over state-of-the-art baselines, with robust behavior under adversarial perturbations and efficient scaling to million-item streams.

📝 Abstract
Ensuring transparency in AI decision-making requires interpretable explanations, particularly at the instance level. Counterfactual explanations are a powerful tool for this purpose, but existing techniques frequently depend on synthetic examples, introducing biases from unrealistic assumptions, flawed models, or skewed data. Many methods also assume full dataset availability, an impractical constraint in real-time environments where data flows continuously. In contrast, streaming explanations offer adaptive, real-time insights without requiring persistent storage of the entire dataset. This work introduces a scalable, model-free approach to selecting diverse and relevant counterfactual examples directly from observed data. Our algorithm operates efficiently in streaming settings, maintaining $O(\log k)$ update complexity per item while ensuring high-quality counterfactual selection. Empirical evaluations on both real-world and synthetic datasets demonstrate superior performance over baseline methods, with robust behavior even under adversarial conditions.
Problem

Research questions and friction points this paper is trying to address.

Model-free counterfactual subset selection
Real-time streaming explanations
Bias-free interpretable AI decisions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-free counterfactual selection
Streaming data processing
Efficient O(log k) update
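
To make the O(log k) update concrete, here is a minimal sketch of one way a per-item cost of that order can arise: a bounded heap that keeps the k observed points nearest to a query among those with a different label. This is not the paper's algorithm. It is a simplified, relevance-only illustration that omits the diversity term, and the class name `StreamingCounterfactualTopK`, the Euclidean distance, and the label convention are all assumptions made for the example.

```python
import heapq
import math
from typing import List, Sequence, Tuple


class StreamingCounterfactualTopK:
    """Hypothetical helper: keep the k observed points closest to `query`
    whose label differs from `query_label`. Only real, observed points are
    stored, so no synthetic samples are generated."""

    def __init__(self, query: Sequence[float], query_label: int, k: int) -> None:
        if k <= 0:
            raise ValueError("k must be positive")
        self.query = tuple(query)
        self.query_label = query_label
        self.k = k
        # heapq is a min-heap; storing negated distances makes the root
        # the currently selected point that is farthest from the query.
        self._heap: List[Tuple[float, int, Tuple[float, ...]]] = []
        self._seen = 0  # arrival counter, used only to break ties

    def update(self, x: Sequence[float], label: int) -> None:
        """Process one arriving item with at most one O(log k) heap operation."""
        self._seen += 1
        if label == self.query_label:
            return  # same outcome as the query: not a counterfactual
        dist = math.dist(self.query, x)  # assumed relevance: Euclidean distance
        entry = (-dist, self._seen, tuple(x))
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
        elif dist < -self._heap[0][0]:
            # Closer than the worst kept point: swap it in.
            heapq.heapreplace(self._heap, entry)

    def selected(self) -> List[Tuple[Tuple[float, ...], float]]:
        """Return (point, distance) pairs, nearest first."""
        return [(x, -neg) for neg, _, x in sorted(self._heap, reverse=True)]


if __name__ == "__main__":
    selector = StreamingCounterfactualTopK(query=(0.0, 0.0), query_label=0, k=2)
    stream = [((3.0, 4.0), 1), ((1.0, 0.0), 0), ((0.0, 1.0), 1), ((0.0, 2.0), 1)]
    for point, label in stream:
        selector.update(point, label)
    print(selector.selected())
```

Memory stays at k items regardless of stream length, and items that are not selected are never stored. A joint diversity and relevance objective like the one the summary describes would need a different scoring rule; with a naive marginal-gain computation, each update would cost O(k) rather than O(log k).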
Minh Hieu Nguyen
Viet Hung Doan
Anh Tuan Nguyen
Jun Jo
Griffith University
satellite data analysis, medical data analysis, ubiquitous robotics, e-learning
Quoc Viet Hung Nguyen