MIST: Reliable Streaming Decision Trees for Online Class-Incremental Learning via McDiarmid Bound

📅 2026-05-12

🤖 AI Summary
This work addresses the performance degradation of streaming decision trees in online class-incremental learning, which stems from the inaccuracy of information gain as the number of classes grows and the absence of effective knowledge transfer mechanisms. To overcome these limitations, the authors propose the MIST framework, which introduces McDiarmid’s confidence radius—agnostic to the number of classes—as a splitting criterion, designs a Bayesian statistical inheritance mechanism with provable variance reduction guarantees, and integrates KLL quantile sketches enabling continuous threshold evaluation and geometry-adaptive prediction. Experimental results demonstrate that MIST matches the performance of global parametric methods on near-Gaussian data under both standard and stress-test conditions, while significantly outperforming existing state-of-the-art approaches in non-Gaussian, geometrically complex scenarios.
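To illustrate why an information-gain criterion becomes harder to trust as classes are added, the sketch below evaluates the standard Hoeffding radius, ε = sqrt(R² ln(1/δ) / (2n)), with the range R set to log₂ K, the range of information gain for K classes. The function names and the sample values for n and δ are illustrative assumptions, and the McDiarmid radius proposed for MIST is not reproduced here because its exact form is not given on this page.

```python
import math


def hoeffding_radius(value_range, n, delta):
    """Standard Hoeffding radius for a statistic with the given range,
    after n observations, at confidence level 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def info_gain_range(num_classes):
    """Range of information gain for a K-class problem: log2(K)."""
    return math.log2(num_classes)


def samples_needed(value_range, epsilon, delta):
    """Observations required before the Hoeffding radius drops to epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))


# Illustrative values (assumptions): n = 1000 samples at a leaf, delta = 1e-7.
radii = {k: hoeffding_radius(info_gain_range(k), n=1000, delta=1e-7)
         for k in (2, 10, 100)}

for k, eps in radii.items():
    print(f"K={k:>3}  range={info_gain_range(k):.3f}  radius={eps:.4f}")
```

With n and δ held fixed, the radius is proportional to log₂ K, so a leaf needs roughly (log₂ K)² times more observations to reach the same radius as in the two-class case.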
📝 Abstract
Streaming decision trees are natural candidates for open-world continual learning, as they perform local updates, use bounded memory, and maintain static decision boundaries. Despite these advantages, they still fail in online class-incremental learning due to two coupled miscalibrations: (i) their split criterion grows unreliable as the class count K expands, and (ii) no knowledge is transferred at split time. Both failures share a common root: the range of Information Gain intrinsically scales with log₂ K. Consequently, any Hoeffding-style confidence radius derived from it must grow with the class count, which makes a K-independent split criterion structurally impossible and removes the potential benefits of applying streaming decision trees to continual learning. To fix this issue, we present MIST (McDiarmid Incremental Streaming Tree), which resolves both failures through three integrated components: (i) a tight, K-independent McDiarmid confidence radius for Gini splitting that acts as a structural regulariser; (ii) a Bayesian inheritance protocol that projects parent statistics to child nodes via truncated-Gaussian moments, with variance reduction guarantees strongest precisely when splitting is most conservative; and (iii) per-leaf KLL quantile sketches that support both continuous threshold evaluation and geometry-adaptive leaf prediction from a single data structure. On standard and stress-test tabular streams, MIST is competitive with global parametric methods on near-Gaussian benchmarks and uniquely robust on non-Gaussian geometry where SOTA methods collapse.
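As a rough illustration of the truncated-Gaussian moments mentioned in component (ii), the sketch below takes a parent feature summarised as N(μ, σ²) and computes the textbook mean and variance of the parts falling on each side of a split threshold. It shows only the standard moment formulas; the function name `truncated_gaussian_children` and its return layout are hypothetical, and the paper's actual inheritance protocol may differ in its priors, weighting, and handling of multiple features.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_gaussian_children(mu, sigma, threshold):
    """Moments of N(mu, sigma^2) restricted to x <= threshold (left child)
    and x > threshold (right child).

    Returns (p_left, (left_mean, left_var), (right_mean, right_var)).
    Assumes the threshold is not so far in a tail that either side has
    zero probability mass in floating point.
    """
    a = (threshold - mu) / sigma
    p_left = _cdf(a)
    p_right = 1.0 - p_left
    lam_left = _pdf(a) / p_left      # inverse Mills ratio, left side
    lam_right = _pdf(a) / p_right    # inverse Mills ratio, right side

    left_mean = mu - sigma * lam_left
    left_var = sigma ** 2 * (1.0 - a * lam_left - lam_left ** 2)
    right_mean = mu + sigma * lam_right
    right_var = sigma ** 2 * (1.0 + a * lam_right - lam_right ** 2)
    return p_left, (left_mean, left_var), (right_mean, right_var)


# Illustrative call: standard normal parent, split at its mean.
p_left, left, right = truncated_gaussian_children(0.0, 1.0, 0.0)
print(p_left, left, right)
```

Each child's variance is smaller than the parent's σ², which is the elementary fact behind seeding child nodes from projected parent statistics instead of starting them empty.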
Problem

Research questions and friction points this paper is trying to address.

online class-incremental learning
streaming decision trees
Information Gain
split criterion
knowledge transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

streaming decision trees
online class-incremental learning
McDiarmid bound
Bayesian inheritance
KLL quantile sketches
Authors
Phu-Hoa Pham
Faculty of Information and Technology, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
Chi-Nguyen Tran
Faculty of Information and Technology, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
Nguyen Lam Phu Quy
Faculty of Information and Technology, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
Dao Sy Duy Minh
Faculty of Information and Technology, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
Huynh Trung Kiet
Faculty of Information and Technology, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
Long Tran-Thanh
Professor in Computer Science, University of Warwick
Artificial Intelligence, AI for social good, game theory, human-agent learning, multi-armed bandits