SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation

📅 2024-07-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of fair and realistic evaluation frameworks for unsupervised domain adaptation (UDA) beyond computer vision. It proposes SKADA-bench, a cross-modal UDA benchmark covering image, text, biomedical, and tabular data. To mitigate the evaluation bias induced by hyperparameter selection in the unsupervised setting, the benchmark performs nested cross-validation coupled with multiple unsupervised model selection criteria. Shallow UDA algorithms, including reweighting, mapping, and subspace alignment methods, are systematically evaluated on both controlled synthetic shifts and real-world datasets. The framework features a scikit-learn-compatible, modular architecture enabling plug-and-play integration of new methods, datasets, or selection criteria. Experiments demonstrate that the hyperparameter selection strategy significantly impacts UDA performance; reproducible rankings of algorithms across modalities and shift types are released, and the code is publicly available to advance reproducible UDA research.

📝 Abstract
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. While many methods have been proposed in the literature, fair and realistic evaluation remains an open question, particularly due to methodological difficulties in selecting hyperparameters in the unsupervised setting. With SKADA-bench, we propose a framework to evaluate DA methods on diverse modalities, beyond the computer vision tasks that have been largely explored in the literature. We present a complete and fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment. Realistic hyperparameter selection is performed with nested cross-validation and various unsupervised model selection scores, on both simulated datasets with controlled shifts and real-world datasets across diverse modalities, such as images, text, biomedical, and tabular data. Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications, with key insights into the choice and impact of model selection approaches. SKADA-bench is open-source, reproducible, and can be easily extended with novel DA methods, datasets, and model selection criteria without requiring re-evaluating competitors. SKADA-bench is available on GitHub at https://github.com/scikit-adaptation/skada-bench.
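The core difficulty the abstract describes is that hyperparameters must be chosen without target labels, using an unsupervised proxy score. The following is a minimal illustrative sketch of that idea, not the SKADA-bench API: a toy mean-shift mapping whose strength `alpha` is selected by minimizing a linear-kernel MMD between mapped source and target data (`linear_mmd`, `alpha`, and the mapping itself are illustrative choices, simpler than the reweighting, mapping, and subspace methods the benchmark evaluates).

```python
# Sketch of unsupervised model selection for domain adaptation.
# Target labels are never used: the mapping strength is picked with an
# unsupervised discrepancy score, mimicking the benchmark's validation setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (400, 2))                  # labeled source domain
ys = (Xs[:, 0] > 0).astype(int)
Xt = Xs + np.array([1.0, -0.5])                      # shifted, unlabeled target

def linear_mmd(A, B):
    # Linear-kernel MMD^2: squared distance between the domain means.
    return float(np.sum((A.mean(axis=0) - B.mean(axis=0)) ** 2))

# Candidate hyperparameter: how far to shift source toward the target mean.
best_alpha, best_score = None, np.inf
for alpha in [0.0, 0.25, 0.5, 0.75, 1.0]:
    Xs_mapped = Xs + alpha * (Xt.mean(axis=0) - Xs.mean(axis=0))
    score = linear_mmd(Xs_mapped, Xt)                # unsupervised criterion
    if score < best_score:
        best_alpha, best_score = alpha, score

# Train the final classifier on the mapped source data.
Xs_final = Xs + best_alpha * (Xt.mean(axis=0) - Xs.mean(axis=0))
clf = LogisticRegression().fit(Xs_final, ys)
print(best_alpha)  # → 1.0 (fully aligning the means minimizes this score)
```

In the benchmark itself this inner unsupervised selection is wrapped in an outer cross-validation loop on the source data, so that the reported score never touches target labels either.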
Problem

Research questions and friction points this paper is trying to address.

Evaluating unsupervised domain adaptation methods
Realistic hyperparameter selection challenges
Diverse modalities beyond computer vision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fair evaluation framework for shallow DA methods
Nested cross-validation with unsupervised model selection scores
Extensible benchmark spanning image, text, biomedical, and tabular modalities