On the Utility of Domain-Adjacent Fine-Tuned Model Ensembles for Few-shot Problems

Published: 2024-06-19
Venue: arXiv.org
Citations: 0
Influential: 0
AI Summary
To address few-shot learning under scarce target-domain labeled data, this paper proposes DAFT-E: a framework that systematically models domain proximity, integrates multiple cross-domain fine-tuned foundation models, and combines few-shot prompting with weighted output aggregation. DAFT-E is the first method to empirically validate the synergistic gain of proximity-aware model ensembling in few-shot settings. Experiments show that DAFT-E achieves near-optimal zero-shot performance relative to the best single model, and substantially outperforms all individual proximal models in few-shot scenarios, attaining high accuracy with only 1–5 labeled examples per class and thereby drastically reducing data dependency for domain adaptation. Its core contribution lies in uncovering and leveraging the complementary strengths across cross-domain models, establishing an efficient, scalable paradigm for model reuse in low-resource settings.

Abstract
Large Language Models (LLMs) have been observed to perform well on a wide range of downstream tasks when fine-tuned on domain-specific data. However, such data may not be readily available in many applications, motivating zero-shot or few-shot approaches using domain-adjacent models. While several fine-tuned models for various tasks are available, finding an appropriate domain-adjacent model for a given task is often not straightforward. In this paper, we study DAFT-E, a framework that utilizes an Ensemble of Domain-Adjacent Fine-Tuned Foundation Models for few-shot problems. We show that for zero-shot problems, this ensembling method provides accuracy close to that of the single best model. With few-shot problems, this performance improves further, at which point DAFT-E can outperform any single domain-adjacent model while requiring much less data for domain-specific fine-tuning.
Problem

Research questions and friction points this paper is trying to address.

Finding suitable domain-adjacent models for few-shot tasks
Improving accuracy in zero-shot problems using model ensembles
Reducing data needed for domain-specific fine-tuning with ensembles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble of domain-adjacent fine-tuned models
Improves few-shot problem performance
Reduces need for domain-specific data
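
The ensembling idea can be illustrated with a minimal sketch. This is not the paper's implementation: the summary only states that outputs of domain-adjacent models are combined with weights, so the weighting rule used here (smoothed accuracy on the few labeled examples) and the names `ensemble_predict` and `few_shot_weights` are assumptions chosen for illustration. Each model is represented only by the class probabilities it outputs.

```python
from typing import List, Sequence

Probs = Sequence[float]  # one probability per class, from a single model


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def ensemble_predict(model_probs: Sequence[Probs], weights: Sequence[float]) -> int:
    """Predict the class with the highest weighted-average probability."""
    total = sum(weights)
    n_classes = len(model_probs[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    return argmax(combined)


def few_shot_weights(
    support_probs: Sequence[Sequence[Probs]],  # indexed as [model][example]
    support_labels: Sequence[int],
    smoothing: float = 1.0,
) -> List[float]:
    """ASSUMED rule: weight each model by its smoothed accuracy on the support set."""
    n = len(support_labels)
    weights = []
    for per_example in support_probs:
        correct = sum(
            1 for probs, label in zip(per_example, support_labels)
            if argmax(probs) == label
        )
        weights.append((correct + smoothing) / (n + 2 * smoothing))
    return weights


# Hypothetical outputs of three domain-adjacent models on two labeled examples.
support_probs = [
    [[0.9, 0.1], [0.2, 0.8]],  # model A: correct on both
    [[0.4, 0.6], [0.3, 0.7]],  # model B: correct on the second only
    [[0.3, 0.7], [0.6, 0.4]],  # model C: wrong on both
]
support_labels = [0, 1]
weights = few_shot_weights(support_probs, support_labels)

# Hypothetical outputs of the same models on one unlabeled query.
query = [[0.8, 0.2], [0.45, 0.55], [0.1, 0.9]]
zero_shot = ensemble_predict(query, [1.0, 1.0, 1.0])  # uniform weights, no labels used
few_shot = ensemble_predict(query, weights)           # weights from the support set
```

With uniform weights the overconfident model C dominates the vote; once the two labeled examples down-weight it, the ensemble follows model A instead. Other weighting schemes (for example, fitting weights by regression on the support set) would slot into `few_shot_weights` without changing `ensemble_predict`.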