DeconDTN-Toolkit: A Library for Evaluation and Enhancement of Robustness to Provenance Shift

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses provenance shift: out-of-distribution performance degradation caused by a change, at deployment, in the relationship between data source and label. Working within the framework of invariant learning, it formally establishes, for the first time, the theoretical connection between provenance shift and counterfactual invariance, and derives a learning objective tailored for robustness. The core contributions are DeconDTN-Toolkit, the first open-source toolkit for simulating and mitigating provenance shift; a novel evaluation metric for out-of-distribution robustness; and systematic experiments that expose the fragility of empirical risk minimization while demonstrating that the proposed strategy improves model robustness.

📝 Abstract

Despite the burgeoning body of work on distribution shifts, provenance shift, where the relationship between data source and label changes at deployment, remains poorly understood and under-addressed. In this paper, we establish a formal connection between provenance shift, counterfactual invariance, and invariant learning to derive a learning objective for robustness. We then introduce DeconDTN-Toolkit, a specialized evaluation and remediation suite designed to simulate provenance shifts of varying degrees while maintaining the training protocol and the infrastructure of existing benchmarks. We reveal the vulnerability of Empirical Risk Minimization under provenance shift, introduce a robust out-of-distribution performance indicator, and conduct a comprehensive evaluation of existing algorithms. Our work provides the theoretical grounding and the practical tools necessary to characterize the problem of confounding by provenance, along with implementations of methods to mitigate it.
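To make "simulating provenance shifts of varying degrees" concrete, the following is a minimal sketch of one way such a shift could be constructed by subsampling. It is not the DeconDTN-Toolkit API: the function name `subsample_with_provenance_rates`, its parameters, and the synthetic two-source pool are all assumptions made for illustration. The idea is to fix the per-source positive rate P(label = 1 | source) at one setting for training and a different setting for testing, while keeping the overall label prevalence unchanged.

```python
import numpy as np


def subsample_with_provenance_rates(sources, labels, pos_rate_by_source,
                                    n_per_source, seed=0):
    """Hypothetical helper (not part of any published toolkit).

    Returns indices into the pool such that, within each source s, the
    fraction of positive labels equals pos_rate_by_source[s] up to rounding.
    """
    rng = np.random.default_rng(seed)
    sources = np.asarray(sources)
    labels = np.asarray(labels)
    chosen = []
    for s, rate in pos_rate_by_source.items():
        n_pos = int(round(rate * n_per_source))
        n_neg = n_per_source - n_pos
        pos_pool = np.flatnonzero((sources == s) & (labels == 1))
        neg_pool = np.flatnonzero((sources == s) & (labels == 0))
        if len(pos_pool) < n_pos or len(neg_pool) < n_neg:
            raise ValueError(f"source {s!r} has too few examples for rate {rate}")
        # Sample without replacement so no example is duplicated.
        chosen.append(rng.choice(pos_pool, size=n_pos, replace=False))
        chosen.append(rng.choice(neg_pool, size=n_neg, replace=False))
    return np.concatenate(chosen)


# Synthetic pool (assumption): two sources, 1000 examples each, 50% positive.
sources = np.repeat([0, 1], 1000)
labels = np.tile([0, 1], 1000)

# Training split: source 0 is mostly positive, source 1 mostly negative.
train_idx = subsample_with_provenance_rates(
    sources, labels, {0: 0.8, 1: 0.2}, n_per_source=100, seed=1)
# Test split: the source-label relationship is reversed.
test_idx = subsample_with_provenance_rates(
    sources, labels, {0: 0.2, 1: 0.8}, n_per_source=100, seed=2)


def pos_rate(idx, s):
    """Fraction of positives among the selected examples from source s."""
    mask = sources[idx] == s
    return labels[idx][mask].mean()
```

In this sketch both splits have the same overall positive rate, so any performance drop from train to test would be attributable to the changed source-label relationship rather than to a change in label prevalence. Moving the test rates gradually from `{0: 0.8, 1: 0.2}` toward `{0: 0.2, 1: 0.8}` would give shifts of increasing degree.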

Problem

Research questions and friction points this paper is trying to address.

provenance shift

distribution shift

counterfactual invariance

invariant learning

out-of-distribution

Innovation

Methods, ideas, or system contributions that make the work stand out.

provenance shift

counterfactual invariance

invariant learning