🤖 AI Summary
This work addresses the high computational and storage overhead of existing clinical NLP systems, which typically learn a separate task-specific prompt for each deployed task. The authors propose a novel multi-task prompt distillation and decomposition framework that, for the first time, provides a unified prompt representation across diverse clinical NLP tasks. By distilling a shared meta-prompt from 21 source tasks, the method adapts efficiently to unseen target tasks using fewer than 0.05% trainable parameters. Evaluated on backbone models including LLaMA 3.1 8B, Meditron3 8B, and gpt-oss 20B, the approach significantly outperforms LoRA (by +1.5–1.7%) and single-task prompt tuning (by +6.1–6.6%) across 10 target datasets. Notably, gpt-oss 20B achieves the strongest performance on clinical reasoning tasks while supporting efficient zero-shot and few-shot transfer.
📝 Abstract
Existing prompt-based fine-tuning methods typically learn task-specific prompts independently, imposing significant compute and storage overhead when deploying multiple clinical natural language processing (NLP) systems at scale. We present a multi-task prompt distillation and decomposition framework that learns a single shared meta-prompt from 21 diverse clinical source tasks and adapts it to unseen target tasks with fewer than 0.05% trainable parameters. Evaluated across five clinical NLP task types (named entity recognition, relation extraction, question answering, natural language inference, and summarization) on 10 held-out target datasets using three backbone models (LLaMA 3.1 8B, Meditron3 8B, gpt-oss 20B), our framework consistently outperforms LoRA by 1.5–1.7% despite using orders of magnitude fewer parameters, and exceeds single-task prompt tuning by 6.1–6.6%. The gpt-oss 20B model achieves the highest overall performance, particularly on clinical reasoning tasks. Strong zero- and few-shot results further demonstrate the transferability of the shared prompt representation.
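To make the "fewer than 0.05% trainable parameters" claim concrete, here is a back-of-the-envelope parameter count for the kind of shared-prompt-plus-decomposition setup the abstract describes. All dimensions (hidden size, prompt length, rank) and the low-rank factorization form are illustrative assumptions, not the paper's actual configuration:

```python
# Hedged sketch: parameter budget of shared meta-prompt adaptation.
# Every dimension below is an assumption for illustration only.

BACKBONE_PARAMS = 8_000_000_000   # e.g. an 8B-parameter backbone
HIDDEN_DIM = 4096                 # assumed hidden size
PROMPT_LEN = 20                   # assumed soft-prompt length
RANK = 4                          # assumed rank of task-specific factors

# Shared meta-prompt: one PROMPT_LEN x HIDDEN_DIM matrix, distilled once
# from the source tasks and reused across all target tasks.
shared_prompt_params = PROMPT_LEN * HIDDEN_DIM

# Hypothetical per-task decomposition: low-rank factors that specialize
# the shared prompt, e.g. A (PROMPT_LEN x RANK) and B (RANK x HIDDEN_DIM).
per_task_params = PROMPT_LEN * RANK + RANK * HIDDEN_DIM

fraction = (shared_prompt_params + per_task_params) / BACKBONE_PARAMS
print(f"trainable fraction: {fraction:.6%}")
```

Under these assumed dimensions the trainable fraction lands around a thousandth of a percent, comfortably within the sub-0.05% budget the abstract reports, which is why prompt decomposition can undercut LoRA's parameter count by orders of magnitude.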