On the Utility of Domain-Adjacent Fine-Tuned Model Ensembles for Few-shot Problems

📅 2024-06-19

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address few-shot learning when labeled target-domain data is scarce, this paper proposes DAFT-E, a framework that systematically models domain proximity, integrates multiple cross-domain fine-tuned foundation models, and combines few-shot prompting with weighted output aggregation. DAFT-E is the first method to empirically validate the synergistic gain of proximity-aware model ensembling in few-shot settings. Experiments show that in the zero-shot setting DAFT-E performs close to the best single model, and that in few-shot scenarios it substantially outperforms every individual proximal model, attaining high accuracy with only 1–5 labeled examples per class and thereby sharply reducing the data needed for domain adaptation. Its core contribution lies in uncovering and exploiting the complementary strengths of cross-domain models, establishing an efficient, scalable paradigm for model reuse in low-resource settings.

📝 Abstract

Large Language Models (LLMs) have been observed to perform well on a wide range of downstream tasks when fine-tuned on domain-specific data. However, such data may not be readily available in many applications, motivating zero-shot or few-shot approaches using domain-adjacent models. While several fine-tuned models for various tasks are available, finding an appropriate domain-adjacent model for a given task is often not straightforward. In this paper, we study DAFT-E, a framework that utilizes an Ensemble of Domain-Adjacent Fine-Tuned Foundation Models for few-shot problems. We show that for zero-shot problems, this ensembling method provides accuracy close to that of the single best model. With few-shot problems, this performance improves further, at which point DAFT-E can outperform any single domain-adjacent model while requiring much less data for domain-specific fine-tuning.
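
The ensembling idea in the abstract can be illustrated with a small sketch. The page does not state how DAFT-E weights its member models, so everything below is an assumption made for illustration: the function names `ensemble_predict` and `few_shot_weights` are hypothetical, the zero-shot case is modeled as a uniform average of each model's class probabilities, and the few-shot case is modeled by weighting each model with a softmax over its accuracy on the few labeled examples. It should be read as one plausible instance of weighted output aggregation, not as the paper's algorithm.

```python
import math


def _argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def ensemble_predict(model_probs, weights=None):
    """Combine per-model class probabilities for a single input.

    model_probs[m] is model m's probability vector over the classes.
    With weights=None every model counts equally (the zero-shot case
    in this sketch).
    """
    n_models = len(model_probs)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(model_probs[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, model_probs))
        for c in range(n_classes)
    ]
    return _argmax(combined), combined


def few_shot_weights(few_shot_probs, labels, temperature=0.1):
    """Assumed weighting rule: softmax over few-shot accuracy.

    few_shot_probs[m][i] is model m's probability vector for labeled
    example i. A lower temperature concentrates weight on the models
    that do best on the few labeled examples.
    """
    accuracies = []
    for probs_per_example in few_shot_probs:
        correct = sum(
            1 for probs, y in zip(probs_per_example, labels)
            if _argmax(probs) == y
        )
        accuracies.append(correct / len(labels))
    exps = [math.exp(a / temperature) for a in accuracies]
    total = sum(exps)
    return [e / total for e in exps]


# Three hypothetical domain-adjacent models, four labeled examples, two classes.
few_shot_probs = [
    [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]],  # agrees with every label
    [[0.6, 0.4], [0.6, 0.4], [0.6, 0.4], [0.6, 0.4]],  # always predicts class 0
    [[0.3, 0.7], [0.7, 0.3], [0.4, 0.6], [0.8, 0.2]],  # disagrees with every label
]
labels = [0, 1, 0, 1]
weights = few_shot_weights(few_shot_probs, labels)

new_input_probs = [[0.3, 0.7], [0.8, 0.2], [0.9, 0.1]]
zero_shot_label, _ = ensemble_predict(new_input_probs)
few_shot_label, _ = ensemble_predict(new_input_probs, weights)
```

In this toy data the uniform average follows the two confident but unreliable models and picks class 0, while the few-shot weights shift nearly all of the mass to the first model and pick class 1. That mirrors the qualitative claim in the abstract that a handful of labeled examples can improve on the zero-shot ensemble, under the assumed weighting rule.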

Problem

Research questions and friction points this paper is trying to address.

Finding suitable domain-adjacent models for few-shot tasks

Improving accuracy in zero-shot problems using model ensembles

Reducing data needed for domain-specific fine-tuning with ensembles

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble of domain-adjacent fine-tuned models

Improves few-shot problem performance

Reduces need for domain-specific data
