Enhancing Cross-task Transfer of Large Language Models via Activation Steering

📅 2025-07-17

📈 Citations: 0

✨ Influential: 0

career value

192K/year

🤖 AI Summary

Large language models (LLMs) exhibit limited generalization to low-resource and unseen tasks; existing cross-task in-context learning approaches suffer from poor robustness, weak scalability, and high computational overhead. To address these limitations, we propose CAST—a novel framework for parameter-free, input-length-invariant latent-space activation-guided knowledge transfer across tasks. CAST dynamically steers hidden-layer representations for low-resource tasks by leveraging contrastively enhanced, representative activation patterns extracted from high-resource tasks. It integrates latent-space activation analysis, representative sample selection, and contrastive representation enhancement. Evaluated on cross-domain and cross-lingual transfer benchmarks, CAST significantly outperforms strong baselines, achieving an average accuracy gain of 4.2% while reducing inference latency by 37%. The method is highly efficient, scalable, and robust—demonstrating consistent performance across diverse task distributions and data regimes.

Technology Category

Application Category

📝 Abstract

Large language models (LLMs) have shown impressive abilities in leveraging pretrained knowledge through prompting, but they often struggle with unseen tasks, particularly in data-scarce scenarios. While cross-task in-context learning offers a direct solution for transferring knowledge across tasks, it still faces critical challenges in terms of robustness, scalability, and efficiency. In this paper, we investigate whether cross-task transfer can be achieved via latent space steering without parameter updates or input expansion. Through an analysis of activation patterns in the latent space of LLMs, we observe that the enhanced activations induced by in-context examples have consistent patterns across different tasks. Inspired by these findings, we propose CAST, a novel Cross-task Activation Steering Transfer framework that enables effective transfer by manipulating the model's internal activation states. Our approach first selects influential and diverse samples from high-resource tasks, then utilizes their contrastive representation-enhanced activations to adapt LLMs to low-resource tasks. Extensive experiments across both cross-domain and cross-lingual transfer settings show that our method outperforms competitive baselines and demonstrates superior scalability and lower computational costs.

Problem

Research questions and friction points this paper is trying to address.

Improving cross-task transfer in LLMs without parameter updates

Addressing robustness and scalability in cross-task knowledge transfer

Enhancing efficiency of LLMs in data-scarce task scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent space steering without parameter updates

Cross-task Activation Steering Transfer framework

Contrastive representation-enhanced activations for adaptation

🔎 Similar Papers

No similar papers found.