🤖 AI Summary
To address the limitation of conventional time-series imputation methods—which ignore class label information in time-series classification tasks—this paper proposes a label-guided supervised imputation framework. Our approach constructs a label-constrained proximity matrix using tree-based models (e.g., random forests or gradient-boosted trees), thereby embedding class semantics into the similarity metric and enabling conditional, weighted imputation. Unlike unsupervised imputation, our method explicitly models the joint distribution of labels and time-series observations, yielding discriminative imputations that better preserve class-separability. Experiments demonstrate that, despite non-zero reconstruction error in imputed values, downstream classification accuracy is significantly improved—validating the critical role of label guidance in time-series imputation. This work establishes a novel paradigm for supervised time-series imputation.
📝 Abstract
Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in the context of time series classification, where each time series is associated with a categorical label. We define a means of imputing missing values conditional upon labels, the method being guided by powerful, existing supervised models designed for high accuracy in this task. From each model, we extract a tree-based proximity measure from which imputation can be applied. We show that imputation using this method generally provides richer information leading to higher classification accuracies, despite the imputed values differing from the true values.