🤖 AI Summary
Existing high-dimensional linear regression transfer learning methods rely on feature alignment assumptions, rendering them inadequate for scenarios where source and target domains exhibit semantic inconsistency yet functional similarity. To address this limitation, we propose Cross-Semantic Transfer Learning (CSTL), the first framework to relax the feature alignment requirement. CSTL employs SCAD-derivative-based adaptive weighting to fuse source-domain coefficients, preserving transferable signals while suppressing noise, achieving a theoretically optimal bias-variance trade-off. It further introduces a weighted fusion penalty and an efficient ADMM-based optimization algorithm, accompanied by rigorous convergence guarantees for the oracle estimator. Extensive experiments on synthetic and real-world datasets demonstrate that CSTL significantly outperforms state-of-the-art methods, reducing estimation error by over 30% in cross-semantic and partially signal-similar settings.
📝 Abstract
Current transfer learning methods for high-dimensional linear regression assume feature alignment across domains, restricting their applicability to semantically matched features. In many real-world scenarios, however, distinct features in the target and source domains can play similar predictive roles, creating a form of cross-semantic similarity. To leverage this broader transferability, we propose the Cross-Semantic Transfer Learning (CSTL) framework. It captures potential relationships by comparing each target coefficient with all source coefficients through a weighted fusion penalty. The weights are derived from the derivative of the SCAD penalty, effectively approximating an ideal weighting scheme that preserves transferable signals while filtering out source-specific noise. For computational efficiency, we implement CSTL using the Alternating Direction Method of Multipliers (ADMM). Theoretically, we establish that under mild conditions, CSTL achieves the oracle estimator with overwhelming probability. Empirical results from simulations and a real-data application confirm that CSTL outperforms existing methods in both cross-semantic and partial signal similarity settings.
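To make the weighting idea concrete, the following is a minimal sketch (not the paper's implementation) of how SCAD-derivative-based adaptive weights could be computed. It assumes the standard SCAD derivative of Fan and Li with shape parameter `a = 3.7`, and the hypothetical convention that each weight is proportional to the SCAD derivative evaluated at the absolute difference between a target coefficient and a source coefficient; the function and variable names (`scad_derivative`, `fusion_weights`, `beta_target`, `beta_source`) are illustrative, not from the paper.

```python
import numpy as np


def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty evaluated at |t|:
    lam                  if |t| <= lam
    (a*lam - |t|)/(a-1)  if lam < |t| <= a*lam
    0                    if |t| > a*lam
    """
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(
        t <= lam,
        lam,
        np.where(t <= a * lam, np.maximum(a * lam - t, 0.0) / (a - 1), 0.0),
    )


def fusion_weights(beta_target, beta_source, lam, a=3.7):
    """Adaptive weight matrix w[j, k] comparing target coefficient j
    with source coefficient k (normalized to [0, 1]).

    Small coefficient differences (likely transferable signal) keep a
    weight near 1; large differences (source-specific noise) are driven
    to 0, approximating an oracle weighting scheme.
    """
    diff = np.abs(beta_target[:, None] - beta_source[None, :])
    return scad_derivative(diff, lam, a) / lam
```

Because the weight matrix compares every target coefficient with every source coefficient, no one-to-one feature alignment between domains is needed; the resulting weights would then scale the pairwise fusion penalty inside the ADMM solver.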