Positive-First Most Ambiguous: A Simple Active Learning Criterion for Interactive Retrieval of Rare Categories

📅 2026-03-25
🤖 AI Summary
This work addresses the efficient discovery of rare fine-grained visual categories under limited annotation budgets and severe class imbalance, a setting where conventional active learning methods often struggle. The authors propose an active learning criterion for interactive retrieval, Positive-First Most Ambiguous (PF-MA), which addresses class imbalance by combining a positive-first principle with selection of the most ambiguous (near-boundary) samples. To assess the visual diversity of the retrieved positives, they introduce a class coverage metric that measures how well the selected positives span the variability of the target class. The approach is built on a relevance feedback framework in which a lightweight classifier is iteratively refined from binary user annotations. Experiments on long-tailed datasets, including fine-grained botanical images, show that PF-MA outperforms strong baselines in both coverage and classifier performance, returning a high proportion of relevant samples early in the annotation process.

📝 Abstract
Real-world fine-grained visual retrieval often requires discovering a rare concept from large unlabeled collections with minimal supervision. This is especially critical in biodiversity monitoring, ecological studies, and long-tailed visual domains, where the target may represent only a tiny fraction of the data, creating highly imbalanced binary problems. Interactive retrieval with relevance feedback offers a practical solution: starting from a small query, the system selects candidates for binary user annotation and iteratively refines a lightweight classifier. While Active Learning (AL) is commonly used to guide selection, conventional AL assumes symmetric class priors and large annotation budgets, limiting effectiveness in imbalanced, low-budget, low-latency settings. We introduce Positive-First Most Ambiguous (PF-MA), a simple yet effective AL criterion that explicitly addresses the class imbalance asymmetry: it prioritizes near-boundary samples while favoring likely positives, enabling rapid discovery of subtle visual categories while maintaining informativeness. Unlike standard methods that oversample negatives, PF-MA consistently returns small batches with a high proportion of relevant samples, improving early retrieval and user satisfaction. To capture retrieval diversity, we also propose a class coverage metric that measures how well selected positives span the visual variability of the target class. Experiments on long-tailed datasets, including fine-grained botanical data, demonstrate that PF-MA consistently outperforms strong baselines in both coverage and classifier performance, across varying class sizes and descriptors. Our results highlight that aligning AL with the asymmetric and user-centric objectives of interactive fine-grained retrieval enables simple yet powerful solutions for retrieving rare and visually subtle categories in realistic human-in-the-loop settings.
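The abstract describes PF-MA only in words: it prioritizes near-boundary samples while favoring likely positives. The sketch below is one plausible reading of that sentence, not the authors' definition. It assumes a classifier that outputs a signed score per unlabeled sample (positive side at `score >= 0`), and the function names `pf_ma_select` and `most_ambiguous_select` are hypothetical. For contrast, it also shows plain most-ambiguous selection, which ignores the sign of the score.

```python
import numpy as np


def most_ambiguous_select(scores, batch_size):
    """Baseline: pick the samples closest to the decision boundary,
    regardless of which side they fall on."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(np.abs(scores), kind="stable")
    return order[:batch_size]


def pf_ma_select(scores, batch_size):
    """Assumed reading of 'positive-first most ambiguous':
    rank predicted positives (score >= 0) by closeness to the boundary,
    and only fall back to predicted negatives (again closest first)
    when there are not enough predicted positives to fill the batch."""
    scores = np.asarray(scores, dtype=float)
    idx = np.arange(scores.size)

    pos = idx[scores >= 0]
    neg = idx[scores < 0]

    # Positive side: smallest score is nearest the boundary.
    pos_sorted = pos[np.argsort(scores[pos], kind="stable")]
    # Negative side: largest (least negative) score is nearest the boundary.
    neg_sorted = neg[np.argsort(-scores[neg], kind="stable")]

    order = np.concatenate([pos_sorted, neg_sorted])
    return order[:batch_size]


if __name__ == "__main__":
    # Illustrative scores only, not data from the paper.
    scores = [-2.0, 0.1, 1.5, -0.05, 0.4]
    print("MA   :", most_ambiguous_select(scores, 2).tolist())
    print("PF-MA:", pf_ma_select(scores, 2).tolist())
```

On these illustrative scores, the baseline spends half of its two-sample batch on a predicted negative, whereas the positive-first variant returns only predicted positives while still starting from the boundary. In a heavily imbalanced pool, where most near-boundary samples are negatives, that difference is what the abstract credits for the higher proportion of relevant samples per batch.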
Problem

Research questions and friction points this paper is trying to address.

rare category retrieval
class imbalance
active learning
interactive retrieval
fine-grained visual recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active Learning
Class Imbalance
Rare Category Retrieval
Interactive Retrieval
Positive-First Most Ambiguous
Kawtar Zaher
INRIA, LIRMM, Université de Montpellier, France; Institut National de l’Audiovisuel, France
Olivier Buisson
Institut National de l’Audiovisuel, France
Alexis Joly
Research Director, Inria, Montpellier University, LIRMM
machine learning, biodiversity, information retrieval, plant identification