🤖 AI Summary
To address few-shot learning when target-domain labeled data is scarce, this paper proposes DAFT-E: a framework that models domain proximity, integrates multiple cross-domain fine-tuned foundation models, and combines few-shot prompting with weighted output aggregation. DAFT-E is presented as the first method to empirically validate the synergistic gain of proximity-aware model ensembling in few-shot settings. Experiments show that DAFT-E achieves zero-shot performance close to that of the best single model, and substantially outperforms every individual proximal model in few-shot scenarios, attaining high accuracy with only 1–5 labeled examples per class and thereby drastically reducing the data required for domain adaptation. Its core contribution lies in uncovering and leveraging the complementary strengths of cross-domain models, establishing an efficient, scalable paradigm for model reuse in low-resource settings.
📝 Abstract
Large Language Models (LLMs) have been observed to perform well on a wide range of downstream tasks when fine-tuned on domain-specific data. However, such data may not be readily available in many applications, motivating zero-shot or few-shot approaches using domain-adjacent models. While several fine-tuned models for various tasks are available, finding an appropriate domain-adjacent model for a given task is often not straightforward. In this paper, we study DAFT-E, a framework that utilizes an Ensemble of Domain-Adjacent Fine-Tuned Foundation Models for few-shot problems. We show that for zero-shot problems, this ensembling method provides accuracy close to that of the single best model. With few-shot problems, this performance improves further, at which point DAFT-E can outperform any single domain-adjacent model while requiring much less data for domain-specific fine-tuning.
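The weighted output aggregation described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function names, the accuracy-based weighting scheme, and the Laplace smoothing are assumptions made for the example; the paper's precise proximity-aware weighting is not specified here.

```python
def few_shot_weights(correct_counts, smoothing=1.0):
    """Derive a weight per ensemble member from how many of the few
    labeled examples each model classified correctly (hypothetical
    scheme; Laplace smoothing keeps zero-accuracy models nonzero)."""
    return [c + smoothing for c in correct_counts]

def ensemble_predict(model_probs, weights):
    """Combine per-class probability vectors from several
    domain-adjacent models by a weighted average, then return the
    argmax class and the combined distribution."""
    assert len(model_probs) == len(weights)
    total = sum(weights)
    n_classes = len(model_probs[0])
    combined = [
        sum(w * probs[c] for probs, w in zip(model_probs, weights)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=combined.__getitem__), combined

# Three hypothetical models' class-probability outputs for one input:
outputs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
# Weights derived from few-shot accuracy (e.g. 0, 1, 1 correct):
weights = few_shot_weights([0, 1, 1])  # -> [1.0, 2.0, 2.0]
pred, dist = ensemble_predict(outputs, weights)
print(pred)  # class favored by the better-weighted models
```

With uniform weights this reduces to the zero-shot case, where the ensemble tracks the average of the available domain-adjacent models; the few labeled examples serve only to reweight members, which is why so little target-domain data is needed.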