🤖 AI Summary
Large language models (LLMs) exhibit limited generalization to low-resource and unseen tasks; existing cross-task in-context learning approaches suffer from poor robustness, weak scalability, and high computational overhead. To address these limitations, we propose CAST—a novel framework for parameter-free, input-length-invariant latent-space activation-guided knowledge transfer across tasks. CAST dynamically steers hidden-layer representations for low-resource tasks by leveraging contrastively enhanced, representative activation patterns extracted from high-resource tasks. It integrates latent-space activation analysis, representative sample selection, and contrastive representation enhancement. Evaluated on cross-domain and cross-lingual transfer benchmarks, CAST significantly outperforms strong baselines, achieving an average accuracy gain of 4.2% while reducing inference latency by 37%. The method is highly efficient, scalable, and robust—demonstrating consistent performance across diverse task distributions and data regimes.
📝 Abstract
Large language models (LLMs) have shown impressive abilities in leveraging pretrained knowledge through prompting, but they often struggle with unseen tasks, particularly in data-scarce scenarios. While cross-task in-context learning offers a direct solution for transferring knowledge across tasks, it still faces critical challenges in terms of robustness, scalability, and efficiency. In this paper, we investigate whether cross-task transfer can be achieved via latent space steering without parameter updates or input expansion. Through an analysis of activation patterns in the latent space of LLMs, we observe that the enhanced activations induced by in-context examples have consistent patterns across different tasks. Inspired by these findings, we propose CAST, a novel Cross-task Activation Steering Transfer framework that enables effective transfer by manipulating the model's internal activation states. Our approach first selects influential and diverse samples from high-resource tasks, then utilizes their contrastive representation-enhanced activations to adapt LLMs to low-resource tasks. Extensive experiments across both cross-domain and cross-lingual transfer settings show that our method outperforms competitive baselines and demonstrates superior scalability and lower computational costs.