Transformers Learn Robust In-Context Regression under Distributional Uncertainty

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the in-context learning capability of Transformers for noisy linear regression under realistic distributional shifts, including non-Gaussian coefficients, heavy-tailed noise, and non-i.i.d. prompts, conditions under which conventional in-context linear regression analyses break down. The study systematically compares Transformer performance with classical maximum-likelihood estimators across a range of non-ideal settings. Across all distributional-shift scenarios, Transformers consistently match or surpass baselines that are optimal or near-optimal under the corresponding maximum-likelihood criteria, exhibiting notable robustness and adaptability. The results provide empirical evidence that Transformers can reliably perform in-context regression even under substantial distributional uncertainty, suggesting their potential as flexible and resilient learners in complex, real-world environments.

📝 Abstract
Recent work has shown that Transformers can perform in-context learning for linear regression under restrictive assumptions, including i.i.d. data, Gaussian noise, and Gaussian regression coefficients. However, real-world data often violate these assumptions: the distributions of inputs, noise, and coefficients are typically unknown, non-Gaussian, and may exhibit dependency across the prompt. This raises a fundamental question: can Transformers learn effectively in-context under realistic distributional uncertainty? We study in-context learning for noisy linear regression under a broad range of distributional shifts, including non-Gaussian coefficients, heavy-tailed noise, and non-i.i.d. prompts. We compare Transformers against classical baselines that are optimal or suboptimal under the corresponding maximum-likelihood criteria. Across all settings, Transformers consistently match or outperform these baselines, demonstrating robust in-context adaptation beyond classical estimators.
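The evaluation setting described in the abstract can be sketched numerically: draw a noisy linear-regression prompt whose noise is heavy-tailed rather than Gaussian, then score a classical baseline on a held-out query point. This is a minimal sketch, not the paper's protocol; the prompt length, dimension, Student-t noise, ridge baseline, and all constants below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_prompt(n=40, d=8, df=3.0, rng=rng):
    """Sample one noisy linear-regression prompt.

    Hypothetical setup: w ~ N(0, I), x_i ~ N(0, I),
    y_i = <w, x_i> + eps_i with Student-t (heavy-tailed) noise.
    """
    w = rng.standard_normal(d)
    X = rng.standard_normal((n, d))
    eps = rng.standard_t(df, size=n)  # heavy tails violate the Gaussian-noise assumption
    y = X @ w + eps
    return X, y, w

def ridge(X, y, lam=1.0):
    """Classical baseline: ridge estimator (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Average squared prediction error at a fresh query point over many prompts;
# a Transformer trained for in-context regression would be scored the same way.
errs = []
for _ in range(200):
    X, y, w = make_prompt()
    w_hat = ridge(X, y)
    x_q = rng.standard_normal(8)
    errs.append((x_q @ w_hat - x_q @ w) ** 2)

print(f"ridge query MSE under Student-t noise: {np.mean(errs):.3f}")
```

Under this scoring, "matching or outperforming the baseline" means the model's mean query error is at most that of the classical estimator under each shift.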
Problem

Research questions and friction points this paper is trying to address.

in-context learning
distributional uncertainty
linear regression
Transformers
non-i.i.d. data
Innovation

Methods, ideas, or system contributions that make the work stand out.

in-context learning
distributional uncertainty
Transformers
robust regression
non-i.i.d. prompts
Hoang T. H. Cao
Faculty of Computer Science and Engineering, Ho Chi Minh University of Technology (HCMUT), Vietnam National University Ho Chi Minh City (VNU-HCM), Vietnam
Hai D. V. Trinh
Faculty of Computer Science and Engineering, Ho Chi Minh University of Technology (HCMUT), Vietnam National University Ho Chi Minh City (VNU-HCM), Vietnam
Tho Quan
Unknown affiliation
Lan V. Truong
Faculty of Computer Science and Engineering, Ho Chi Minh University of Technology (HCMUT), Vietnam National University Ho Chi Minh City (VNU-HCM), Vietnam