🤖 AI Summary
This work addresses two problems: multi-task learning degrades when target data are scarce, and many existing transfer methods apply only under restrictive bounded-difference assumptions between the source and target models. The authors propose SMART, a spectral transfer method for multi-task linear regression that replaces the bounded-difference condition with a spectral similarity assumption: the target left and right singular subspaces lie within the corresponding source subspaces and are sparsely aligned with the source singular bases. SMART estimates the target coefficient matrix through structured regularization that incorporates the source model's spectral information. It needs only a fitted source model, not the raw source data, which makes it usable when data sharing is limited. The resulting optimization problem is nonconvex, and the authors solve it with a practical ADMM-based algorithm. The theoretical analysis establishes general non-asymptotic error bounds and a minimax lower bound in the noiseless-source regime; under additional regularity conditions, these yield near-minimax Frobenius error rates up to logarithmic factors. Simulations show improved estimation accuracy and robustness to negative transfer, and an analysis of multi-modal single-cell data shows better predictive performance.
📝 Abstract
Multi-task learning is effective for related applications, but its performance can deteriorate when the target sample size is small. Transfer learning can borrow strength from related studies; yet, many existing methods rely on restrictive bounded-difference assumptions between the source and target models. We propose SMART, a spectral transfer method for multi-task linear regression that instead assumes spectral similarity: the target left and right singular subspaces lie within the corresponding source subspaces and are sparsely aligned with the source singular bases. Such an assumption is natural when studies share latent structures and enables transfer beyond the bounded-difference settings. SMART estimates the target coefficient matrix through structured regularization that incorporates spectral information from a source study. Importantly, it requires only a fitted source model rather than the raw source data, making it useful when data sharing is limited. Although the optimization problem is nonconvex, we develop a practical ADMM-based algorithm. We establish general, non-asymptotic error bounds and a minimax lower bound in the noiseless-source regime. Under additional regularity conditions, these results yield near-minimax Frobenius error rates up to logarithmic factors. Simulations confirm improved estimation accuracy and robustness to negative transfer, and analysis of multi-modal single-cell data demonstrates better predictive performance. The Python implementation of SMART, along with the code to reproduce all experiments in this paper, is publicly available at https://github.com/boxinz17/smart.
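To make the spectral similarity assumption in the abstract concrete, the following is a minimal NumPy sketch of one simple reading of it. It is not the SMART estimator and is not taken from the linked repository. The function names, matrix sizes, and the choice of a diagonal sparse "alignment" core are all illustrative assumptions. The sketch builds a source coefficient matrix, builds a target matrix that reuses only a few source singular directions, and checks that the target singular subspaces sit inside the source ones.

```python
import numpy as np


def make_spectrally_similar_pair(p=30, q=20, r_source=5, n_active=2, seed=0):
    """Build a hypothetical (source, target) coefficient pair.

    Assumption for illustration only: the target is U_s @ core @ V_s.T,
    where core is a sparse r_source x r_source matrix, so the target
    reuses a small subset of the source singular directions.
    """
    rng = np.random.default_rng(seed)

    # Orthonormal source singular bases from QR of Gaussian matrices.
    U_s, _ = np.linalg.qr(rng.standard_normal((p, r_source)))
    V_s, _ = np.linalg.qr(rng.standard_normal((q, r_source)))
    d_s = np.linspace(5.0, 1.0, r_source)  # source singular values
    B_source = U_s @ np.diag(d_s) @ V_s.T

    # Sparse core: only n_active diagonal entries are nonzero.
    core = np.zeros((r_source, r_source))
    active = rng.choice(r_source, size=n_active, replace=False)
    core[active, active] = rng.uniform(1.0, 3.0, size=n_active)
    B_target = U_s @ core @ V_s.T
    return B_source, B_target, U_s, V_s


def subspace_residuals(B, U, V):
    """Frobenius norm of the parts of B outside span(U) and span(V)."""
    left = np.linalg.norm(B - U @ (U.T @ B))    # column-space residual
    right = np.linalg.norm(B - (B @ V) @ V.T)   # row-space residual
    return left, right


if __name__ == "__main__":
    B_s, B_t, U_s, V_s = make_spectrally_similar_pair()
    left, right = subspace_residuals(B_t, U_s, V_s)
    print("residual outside source subspaces:", left, right)
    print("target rank:", np.linalg.matrix_rank(B_t))
    print("Frobenius gap between source and target:", np.linalg.norm(B_t - B_s))
```

In this toy construction the residuals are zero up to floating-point error, while the Frobenius gap between the two matrices is not required to be small. That is the sense in which a spectral assumption can hold even when a bounded-difference assumption would not.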