🤖 AI Summary
This work addresses the performance degradation of streaming decision trees in online class-incremental learning, which stems from the inaccuracy of information gain as the number of classes grows and the absence of effective knowledge transfer mechanisms. To overcome these limitations, the authors propose the MIST framework, which introduces McDiarmid’s confidence radius—agnostic to the number of classes—as a splitting criterion, designs a Bayesian statistical inheritance mechanism with provable variance reduction guarantees, and integrates KLL quantile sketches enabling continuous threshold evaluation and geometry-adaptive prediction. Experimental results demonstrate that MIST matches the performance of global parametric methods on near-Gaussian data under both standard and stress-test conditions, while significantly outperforming existing state-of-the-art approaches in non-Gaussian, geometrically complex scenarios.
📝 Abstract
Streaming decision trees are natural candidates for open-world continual learning, as they perform local updates, use bounded memory, and maintain static decision boundaries. Despite these advantages, they still fail in online class-incremental learning due to two coupled miscalibrations: (i) their split criterion grows unreliable as the class count K expands, and (ii) they transfer no knowledge at split time. Both failures share a common root: the range of Information Gain intrinsically scales with log₂ K. Consequently, any Hoeffding-style confidence radius derived from it must grow with the class count, which makes a K-independent split criterion structurally impossible and negates the potential benefits of applying streaming decision trees to continual learning. To address this, we present MIST (McDiarmid Incremental Streaming Tree), which resolves both failures through three integrated components: (i) a tight, K-independent McDiarmid confidence radius for Gini splitting that acts as a structural regulariser; (ii) a Bayesian inheritance protocol that projects parent statistics to child nodes via truncated-Gaussian moments, with variance reduction guarantees that are strongest precisely when splitting is most conservative; and (iii) per-leaf KLL quantile sketches that support both continuous threshold evaluation and geometry-adaptive leaf prediction from a single data structure. On standard and stress-test tabular streams, MIST is competitive with global parametric methods on near-Gaussian benchmarks and uniquely robust on non-Gaussian geometry, where state-of-the-art baselines collapse.
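The scaling argument in the abstract can be illustrated numerically. The sketch below is not MIST's McDiarmid radius, whose exact form is not given here. It only uses the classic Hoeffding bound, eps = R * sqrt(ln(1/delta) / (2n)), with the range R set to log2 K as for entropy-based Information Gain, and contrasts it with the maximum Gini impurity 1 - 1/K, which stays below 1 for any K. All function names and the chosen values of n and delta are illustrative assumptions.

```python
import math


def hoeffding_radius(value_range, n, delta):
    """Classic Hoeffding bound: eps = R * sqrt(ln(1/delta) / (2n))."""
    return value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def max_entropy(k):
    """Largest possible entropy (in bits) over k classes: log2(k)."""
    return math.log2(k)


def max_gini(k):
    """Largest possible Gini impurity over k classes: 1 - 1/k, always < 1."""
    return 1.0 - 1.0 / k


def samples_needed(value_range, eps, delta):
    """Invert the Hoeffding bound: n = R^2 * ln(1/delta) / (2 * eps^2)."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * eps ** 2))


if __name__ == "__main__":
    n, delta, target_eps = 1000, 1e-7, 0.05  # illustrative values only
    for k in (2, 10, 100, 1000):
        r = max_entropy(k)
        print(
            f"K={k:5d}  range(IG)={r:6.3f}  "
            f"eps@n={n}: {hoeffding_radius(r, n, delta):.4f}  "
            f"n for eps={target_eps}: {samples_needed(r, target_eps, delta)}  "
            f"max Gini={max_gini(k):.4f}"
        )
```

Under these assumptions, the radius at a fixed sample count grows linearly in log2 K, and the number of samples needed to reach a fixed radius grows with (log2 K)^2, while the Gini range stays bounded by 1. This is the gap that a K-independent criterion is meant to close.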