🤖 AI Summary
In subjective annotation tasks, pairwise comparisons incur high labeling costs (O(n²)) and suffer from reliability–efficiency trade-offs. To address this, we propose a zero-shot CLIP-driven human-in-the-loop ranking framework. Our method comprises three key components: (1) hierarchical zero-shot pre-ranking using CLIP to automatically resolve easily distinguishable samples; (2) bucket-aware Elo initialization combined with uncertainty-guided active sampling to prioritize high-information comparisons; and (3) a human-in-the-loop merge-sort algorithm that dynamically integrates model predictions with human feedback. Extensive evaluation across multiple datasets demonstrates that our approach reduces human annotation effort by 90.5% compared to exhaustive pairwise comparison, and further cuts labeling cost by 19.8% relative to the state-of-the-art (n = 100), while maintaining or improving rating consistency.
📝 Abstract
Pairwise comparison is often favored over absolute rating or ordinal classification in subjective or difficult annotation tasks due to its improved reliability. However, exhaustive comparison requires a massive number of annotations (O(n²)). Recent work has greatly reduced the annotation burden (O(n log n)) by actively sampling pairwise comparisons using a sorting algorithm. We further improve annotation efficiency by (1) roughly pre-ordering items hierarchically with the Contrastive Language-Image Pre-training (CLIP) model, without any training, and (2) replacing easy, obvious human comparisons with automated ones. The proposed EZ-Sort first produces a CLIP-based zero-shot pre-ordering, then initializes bucket-aware Elo scores, and finally runs an uncertainty-guided human-in-the-loop MergeSort. Validation on face-age estimation (FGNET), historical image chronology (DHCI), and retinal image quality assessment (EyePACS) showed that EZ-Sort reduced human annotation cost by 90.5% compared to exhaustive pairwise comparison and by 19.8% compared to prior work (when n = 100), while maintaining or improving inter-rater reliability. These results demonstrate that combining CLIP-based priors with uncertainty-aware sampling yields an efficient and scalable solution for pairwise ranking.
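The bucket-aware Elo initialization and uncertainty-guided MergeSort described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bucket indices stand in for the CLIP zero-shot pre-ordering, the `base`/`gap` Elo values and the confidence threshold `tau` are assumed for demonstration, and the human annotator is stubbed as an `oracle` callable.

```python
def bucket_elo_init(buckets, base=1000.0, gap=200.0):
    """Give every item an initial Elo score from its pre-ranking bucket.
    `base` and `gap` are illustrative values, not from the paper."""
    return {item: base + b * gap for item, b in buckets.items()}

def elo_expected(r_a, r_b):
    """Standard Elo expected score: estimated probability that A outranks B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def compare(a, b, ratings, oracle, tau, stats):
    """Uncertainty-guided comparison: resolve automatically when the Elo
    prior is confident, otherwise fall back to the human oracle."""
    p = elo_expected(ratings[a], ratings[b])
    if p >= tau:            # prior confidently says a outranks b
        return True
    if p <= 1.0 - tau:      # prior confidently says b outranks a
        return False
    if stats is not None:   # genuinely uncertain pair -> ask a human
        stats["human_queries"] = stats.get("human_queries", 0) + 1
    return oracle(a, b)

def hil_merge_sort(items, ratings, oracle, tau=0.75, stats=None):
    """MergeSort whose comparator mixes automated and human comparisons.
    Returns the items ordered best-first."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = hil_merge_sort(items[:mid], ratings, oracle, tau, stats)
    right = hil_merge_sort(items[mid:], ratings, oracle, tau, stats)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if compare(left[i], right[j], ratings, oracle, tau, stats):
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged
```

With these assumed constants, a 200-point gap gives cross-bucket pairs an expected score of about 0.76, so they are resolved automatically at `tau=0.75`, while same-bucket ties (expected score 0.5) are routed to the human oracle; this is what cuts the human comparison count below the O(n log n) of a fully human MergeSort.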