AI for the Routine, Humans for the Complex: Accuracy-Driven Data Labelling with Mixed Integer Linear Programming

📅 2025-07-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Deep learning model testing requires near-perfect ground-truth labels, yet high-accuracy manual annotation is prohibitively expensive. To address this, we propose OPAL, a human-in-the-loop annotation framework that formulates label assignment as a mixed-integer linear program (MILP), enforcing a user-specified accuracy target as a hard constraint while minimizing human annotation effort. OPAL integrates semi-supervised learning and active learning to further reduce annotation demand. Extensive evaluation across seven benchmark datasets shows that OPAL achieves an average labeling accuracy of 98.8% while reducing human annotation effort by 52.3%. In visual system test-input validation, it outperforms state-of-the-art methods, saving 28.8% of human effort while improving accuracy by 4.5%; incorporating active learning yields an additional 4.5% reduction in annotation cost. The key contribution is an automated annotation framework that enforces a specified accuracy level as a constraint while minimizing manual labeling cost.

πŸ“ Abstract
The scarcity of accurately labelled data remains a major challenge in deep learning (DL). Many DL approaches rely on semi-supervised methods, which focus on constructing large datasets that require only a minimal amount of human-labelled data. Since DL training algorithms can tolerate moderate label noise, it has generally been acceptable for the accuracy of labels in large training datasets to fall well short of a perfect 100%. However, when it comes to testing DL models, achieving high label accuracy, as close to 100% as possible, is paramount for reliable verification. In this article, we introduce OPAL, a human-assisted labelling method that can be configured to target a desired accuracy level while minimizing the manual effort required for labelling. The main contribution of OPAL is a mixed-integer linear programming (MILP) formulation that minimizes labelling effort subject to a specified accuracy target. We evaluate OPAL for two tasks in the context of testing vision systems: automatic labelling of test data and automated validation of test data. Our evaluation, based on more than 2500 experiments performed on seven datasets, comparing OPAL with eight baseline methods, shows that OPAL, relying on its MILP formulation, achieves an average accuracy of 98.8%, just 1.2% below perfect accuracy, while cutting manual labelling by more than half. Further, OPAL significantly outperforms automated labelling baselines in labelling accuracy across all seven datasets, with large effect sizes, when all methods are provided with the same manual-labelling budget. For automated test-input validation, on average, OPAL reduces manual effort by 28.8% while achieving 4.5% higher accuracy than the state-of-the-art (SOTA) validation baselines. Finally, we show that augmenting OPAL with an active learning loop leads to an additional 4.5% reduction in required manual labelling, without compromising accuracy.
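The core idea, minimizing manual labelling subject to an accuracy constraint via a MILP, can be illustrated with a toy sketch. This is not OPAL's actual formulation (the paper's full model is not reproduced here); it assumes each item comes with a model-confidence estimate of its automatic label being correct, treats human labels as correct, and uses `scipy.optimize.milp` as the solver:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds


def select_manual_labels(confidence, target_accuracy):
    """Toy MILP: choose the fewest items to label manually so that the
    expected fraction of correct labels meets target_accuracy.

    Binary decision h_i: 1 -> item i is labelled by a human (assumed
    correct); 0 -> the model's label is kept, correct with probability
    confidence[i].
    """
    c = np.asarray(confidence, dtype=float)
    n = len(c)

    # Objective: minimize the number of manually labelled items (sum of h).
    objective = np.ones(n)

    # Expected number of correct labels:
    #   sum(c) + sum(h_i * (1 - c_i))  >=  target_accuracy * n
    # Rearranged into a linear constraint on h.
    A = (1.0 - c).reshape(1, -1)
    lower = target_accuracy * n - c.sum()
    constraint = LinearConstraint(A, lower, np.inf)

    res = milp(
        c=objective,
        constraints=constraint,
        integrality=np.ones(n),   # all h_i integer...
        bounds=Bounds(0, 1),      # ...and in {0, 1}
    )
    return np.round(res.x).astype(bool)


# Four items: two confident automatic labels, two uncertain ones.
selected = select_manual_labels([0.99, 0.5, 0.95, 0.6], target_accuracy=0.95)
```

In this example the solver sends exactly the two least-confident items to human annotators, since their expected error contributes most toward the accuracy shortfall; the two high-confidence items keep their automatic labels.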
Problem

Research questions and friction points this paper is trying to address.

Minimizes manual effort for high-accuracy data labeling
Targets near-perfect label accuracy for DL testing
Optimizes labeling using MILP with accuracy constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses MILP to minimize manual labeling effort
Targets high accuracy with human-assisted labeling
Integrates active learning for further efficiency