Adaptive Data Augmentation with Multi-armed Bandit: Sample-Efficient Embedding Calibration for Implicit Pattern Recognition

📅 2026-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of fine-tuning pretrained foundation models for long-tailed implicit pattern recognition tasks, where scarce annotations and high computational costs hinder effective adaptation. To overcome these limitations, the authors propose ADAMAB, a framework that introduces a lightweight, embedder-agnostic calibrator on top of frozen embedding models and integrates an adaptive data augmentation strategy based on multi-armed bandits (MAB). By employing an improved upper confidence bound algorithm, ADAMAB mitigates gradient bias while providing theoretical convergence guarantees. Experimental results demonstrate that, under extreme few-shot settings with fewer than five initial samples per class, ADAMAB achieves up to a 40% absolute accuracy gain in multimodal scenarios, substantially reducing reliance on large-scale labeled datasets.
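The summary describes selecting augmentation strategies with an upper confidence bound rule. As a rough illustration of that idea (not the paper's modified algorithm), the sketch below implements plain UCB1 over a set of hypothetical augmentation "arms"; the arm names and the reward signal (e.g. validation-accuracy gain after training on the augmented samples) are assumptions for illustration.

```python
import math

class UCB1:
    """Minimal UCB1 sketch: each arm is one augmentation strategy.

    Arm names and reward semantics are illustrative, not from the paper.
    """

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}   # times each arm was played
        self.values = {a: 0.0 for a in self.arms} # running mean reward per arm
        self.t = 0                                # total rounds played

    def select(self):
        self.t += 1
        # Play every arm once before applying the confidence bound.
        for a in self.arms:
            if self.counts[a] == 0:
                return a
        # UCB1 score: empirical mean + exploration bonus.
        return max(
            self.arms,
            key=lambda a: self.values[a]
            + math.sqrt(2 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        # Incremental update of the arm's mean observed reward.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a training loop, `select()` picks the augmentation to apply next and `update()` feeds back how much the augmented batch helped, so arms that keep improving the model are pulled more often.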

📝 Abstract
Recognizing implicit visual and textual patterns is essential in many real-world applications of modern AI. However, tackling long-tail pattern recognition tasks remains challenging for current pre-trained foundation models such as LLMs and VLMs. While fine-tuning pre-trained models can improve accuracy in recognizing implicit patterns, it is usually infeasible due to a lack of training data and high computational overhead. In this paper, we propose ADAMAB, an efficient embedding calibration framework for few-shot pattern recognition. To minimize computational costs, ADAMAB trains embedder-agnostic lightweight calibrators on top of fixed embedding models without accessing their parameters. To mitigate the need for large-scale training data, we introduce an adaptive data augmentation strategy based on the Multi-Armed Bandit (MAB) mechanism. With a modified upper confidence bound algorithm, ADAMAB mitigates gradient shift and offers theoretically guaranteed convergence in few-shot training. Our multi-modal experiments demonstrate the superior performance of ADAMAB, with up to 40% accuracy improvement when training with fewer than 5 initial samples per class.
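The abstract's "embedder-agnostic lightweight calibrator" trains on top of fixed embeddings without touching the embedder's parameters. A minimal sketch of that setup, assuming the calibrator is a plain linear softmax head trained with gradient descent (the paper's actual calibrator architecture and training details are not specified here):

```python
import numpy as np

def train_calibrator(embeddings, labels, num_classes, lr=0.1, epochs=300, seed=0):
    """Hypothetical sketch: linear softmax head on frozen embeddings.

    `embeddings` is an (n, d) array produced by a fixed embedder; only the
    head's weights W and bias b are ever updated.
    """
    rng = np.random.default_rng(seed)
    d = embeddings.shape[1]
    W = rng.normal(scale=0.01, size=(d, num_classes))
    b = np.zeros(num_classes)
    Y = np.eye(num_classes)[labels]                   # one-hot targets
    for _ in range(epochs):
        logits = embeddings @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerically stable softmax
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / len(labels)                     # cross-entropy gradient
        W -= lr * embeddings.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

def predict(W, b, embeddings):
    # Calibrated class scores from frozen embeddings.
    return np.argmax(embeddings @ W + b, axis=1)
```

Because only `W` and `b` are trained, the approach works with any embedding model exposed purely as a black-box feature extractor, matching the "without accessing their parameters" constraint.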
Problem

Research questions and friction points this paper is trying to address.

implicit pattern recognition
long-tail recognition
few-shot learning
data scarcity
foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Data Augmentation
Multi-armed Bandit
Embedding Calibration
Few-shot Learning
Sample Efficiency