🤖 AI Summary
Pointwise loss functions (e.g., MSE) in time series modeling induce optimization bias due to their implicit i.i.d. assumption, which violates temporal causality and ignores inherent serial dependence. Method: Under covariance stationarity, we formally characterize this bias as the Expected Optimization Bias (EOB) from an information-theoretic perspective, proving that EOB is an intrinsic data property determined solely by sequence length and the Structural Signal-to-Noise Ratio (SSNR). We derive a closed-form expression for the non-deterministic EOB and propose a debiasing paradigm based on sequence truncation and structural orthogonalization, coupled with a harmonized ℓ_p-norm framework to mitigate gradient ill-conditioning. Contribution/Results: Theoretical analysis establishes the universality of EOB; experiments demonstrate significant improvements in prediction accuracy and training stability across diverse time series tasks, empirically validating SSNR as the key controllable factor governing optimization bias.
📝 Abstract
Optimizing time series models via point-wise loss functions (e.g., MSE) relies on a flawed point-wise independent and identically distributed (i.i.d.) assumption that disregards the causal temporal structure, an issue of growing awareness that still lacks formal theoretical grounding. Focusing on the core independence issue under covariance stationarity, this paper provides a first-principles analysis of the Expected Optimization Bias (EOB), formalizing it information-theoretically as the discrepancy between the true joint distribution and its flawed i.i.d. counterpart. Our analysis reveals a fundamental paradox: the more deterministic and structured the time series, the more severe the bias induced by point-wise loss functions. We derive the first closed-form quantification of the non-deterministic EOB across linear and non-linear systems, and prove that EOB is an intrinsic data property governed exclusively by sequence length and our proposed Structural Signal-to-Noise Ratio (SSNR). This theoretical diagnosis motivates a principled debiasing program that eliminates the bias through sequence length reduction and structural orthogonalization. We present a concrete solution that achieves both principles simultaneously via the DFT or DWT. Furthermore, a novel harmonized $\ell_p$-norm framework is proposed to rectify the gradient pathologies of high-variance series. Extensive experiments validate the generality of the EOB theory and the superior performance of the debiasing program.
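The two debiasing principles named in the abstract (sequence length reduction and structural orthogonalization via the DFT) can be sketched as a loss computed on a truncated set of orthogonal frequency coefficients. This is only an illustrative reading of the abstract, not the paper's actual loss: the function name `dft_truncated_loss`, the `keep` parameter, and the choice of a plain squared error in the frequency domain are all assumptions made for this sketch.

```python
import numpy as np

def dft_truncated_loss(pred, target, keep=8):
    """Illustrative sketch (not the paper's method): compare two series
    in an orthogonal DFT basis (structural orthogonalization), keeping
    only the first `keep` frequency coefficients (sequence length
    reduction), and measure mean squared error in that reduced basis."""
    # rfft projects each real-valued series onto orthogonal Fourier modes.
    p_coeffs = np.fft.rfft(np.asarray(pred, dtype=float))[:keep]
    t_coeffs = np.fft.rfft(np.asarray(target, dtype=float))[:keep]
    # Squared error over the retained complex coefficients.
    return float(np.mean(np.abs(p_coeffs - t_coeffs) ** 2))

# Toy usage: a noisy prediction of a clean sinusoid.
t = np.linspace(0.0, 1.0, 128, endpoint=False)
target = np.sin(2 * np.pi * 3 * t)
pred = target + 0.1 * np.random.default_rng(0).standard_normal(t.size)
loss = dft_truncated_loss(pred, target)
```

Because the retained DFT modes form an orthogonal basis, truncation discards high-frequency noise components rather than mixing them into every point-wise residual, which is one way to read the abstract's claim that a single transform achieves both principles at once.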