🤖 AI Summary
To address the challenge of adapting large language models (LLMs) to low-resource, novel domains and the high cost of full fine-tuning, this paper proposes a parameter-efficient fine-tuning (PEFT)-based cross-domain knowledge transfer framework. It introduces two kinds of adapters — Within-Domain Adapters and Cross-Domain Adapters — to enable knowledge transfer without target-domain labeled data. The authors systematically evaluate six PEFT methods on 14 summarization datasets spanning the Scientific, Medical, Legal, and News domains, using `Llama-3-8B-Instruct` as the backbone. Results show that lightweight models equipped with Within-Domain Adapters outperform both `Llama-3-70B-Instruct` and few-shot baselines; moreover, Cross-Domain Adapters substantially improve zero-shot summarization quality in low-resource settings. The analysis indicates that linguistic commonalities across domains are key to effective PEFT-based transfer, offering an efficient route to adapting smaller LLMs to diverse domain-specific tasks.
📝 Abstract
Large Language Models (LLMs), being generic task solvers, are versatile. However, despite the vast amount of data they are trained on, questions remain about how well they adapt to new domains. Moreover, fully fine-tuning a model to incorporate knowledge of a new domain is computationally expensive and time-consuming. This becomes even more challenging when the domain in question is also low-resource and labeled data is unavailable. To address these challenges, we leverage parameter-efficient fine-tuning techniques (PEFTs) on high-resource datasets to improve performance on unseen low-resource domains. Throughout our experiments, we evaluate whether intrinsic linguistic commonalities between datasets can be leveraged for efficient domain adaptation. We benchmark six PEFTs with `Llama-3-8B-Instruct` on 14 training datasets from the Scientific, Medical, Legal, and News domains for a Text Summarization task. Our experiments show that, for low-resource domains, inference using Within-Domain Adapters can achieve better performance than both Few-Shot prompting and the much larger `Llama-3-70B-Instruct`. Lastly, in the absence of Within-Domain Adapters, we explore the use of Cross-Domain Adapters, as well as strategic combinations of adapters, to leverage intrinsic language similarities across domains, facilitating better adaptability and performance in low-resource settings.