Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification

📅 2025-09-26

📈 Citations: 0

✨ Influential: 0

career value

182K/year

🤖 AI Summary

To address the limitation of conventional time-series imputation methods—which ignore class label information in time-series classification tasks—this paper proposes a label-guided supervised imputation framework. Our approach constructs a label-constrained proximity matrix using tree-based models (e.g., random forests or gradient-boosted trees), thereby embedding class semantics into the similarity metric and enabling conditional, weighted imputation. Unlike unsupervised imputation, our method explicitly models the joint distribution of labels and time-series observations, yielding discriminative imputations that better preserve class-separability. Experiments demonstrate that, despite non-zero reconstruction error in imputed values, downstream classification accuracy is significantly improved—validating the critical role of label guidance in time-series imputation. This work establishes a novel paradigm for supervised time-series imputation.

Technology Category

Application Category

📝 Abstract

Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in the context of time series classification, where each time series is associated with a categorical label. We define a means of imputing missing values conditional upon labels, the method being guided by powerful, existing supervised models designed for high accuracy in this task. From each model, we extract a tree-based proximity measure from which imputation can be applied. We show that imputation using this method generally provides richer information leading to higher classification accuracies, despite the imputed values differing from the true values.

Problem

Research questions and friction points this paper is trying to address.

Imputing missing values in time series classification tasks

Leveraging label information through forest-based proximity measures

Improving classification accuracy despite imperfect imputed values

Innovation

Methods, ideas, or system contributions that make the work stand out.

Label-guided imputation using forest-based proximities

Tree-based proximity measures extracted from supervised models

Conditional imputation on labels for time series classification

🔎 Similar Papers

Forest Proximities for Time Series