On the Learning with Augmented Class via Forests

📅 2025-05-14
🤖 AI Summary
This work addresses learning with augmented class, the open-set setting in which classes unseen during training emerge at test time, using decision trees and forests. The method introduces (1) an augmented Gini impurity, a splitting criterion that exploits unlabeled data from the test distribution; (2) LACForest, which builds shallow forests with this criterion and then splits them further using pseudo-labeled augmented instances; (3) deep neural forests trained with an optimization objective based on the augmented Gini impurity; and (4) a convergence analysis of the augmented Gini impurity. Experiments verify the effectiveness of the approaches. The implementation is publicly available.

📝 Abstract
Decision trees and forests have achieved success in various real applications, most of which assume that all testing classes are known in the training data. In this work, we focus on learning with augmented class via forests, where an augmented class may appear in testing data yet not in training data. We incorporate information about the augmented class into tree splitting: a new splitting criterion, called augmented Gini impurity, is introduced to exploit unlabeled data from the testing distribution. We then develop the approach named Learning with Augmented Class via Forests (LACForest), which constructs shallow forests based on the augmented Gini impurity and then splits forests with pseudo-labeled augmented instances for better performance. We also develop deep neural forests with a novel optimization objective based on our augmented Gini impurity, so as to utilize the representation power of neural networks for forests. Theoretically, we present a convergence analysis for the augmented Gini impurity, and finally we conduct experiments to verify the effectiveness of our approaches. The code is available at https://github.com/nju-xuf/LACForest/.
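The abstract does not spell out the augmented Gini impurity, so the following is only a rough illustration of the idea of folding unlabeled test data into a split criterion. It assumes the test distribution is a mixture theta * P_known + (1 - theta) * P_augmented with a known proportion theta; that mixture assumption, the function names, and the clipping at zero are all hypothetical choices made for this sketch and are not the authors' definition (the linked repository holds the actual implementation).

```python
from collections import Counter


def gini(weights):
    """Standard Gini impurity, 1 - sum_k p_k^2, for non-negative class weights."""
    weights = list(weights)
    total = sum(weights)
    if total <= 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weights)


def node_class_weights(labeled_y_in_node, n_unlabeled_in_node,
                       n_labeled_total, n_unlabeled_total, theta):
    """Hypothetical per-node class mass under the assumed mixture.

    Known-class mass is estimated from labeled data, scaled by theta.
    Augmented-class mass is whatever test mass in the node the known
    classes do not explain, clipped at zero.
    """
    counts = Counter(labeled_y_in_node)
    weights = {k: theta * c / n_labeled_total for k, c in counts.items()}
    test_mass = n_unlabeled_in_node / n_unlabeled_total
    weights["augmented"] = max(0.0, test_mass - sum(weights.values()))
    return weights


# Illustrative node: 4 of 8 labeled points and 6 of 10 unlabeled points fall in it.
w = node_class_weights([0, 0, 1, 1], 6, 8, 10, theta=0.5)
print(w, gini(w.values()))
```

A split search could score each candidate split by the size-weighted sum of this impurity over the two child nodes, in the same way the ordinary Gini impurity is used; whether the paper weights or estimates these quantities in this way is not stated in the abstract.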
Problem

Research questions and friction points this paper is trying to address.

Handling augmented classes that appear in testing data but not in training data
Exploiting unlabeled data from the testing distribution when splitting trees
Improving the classification performance of forests when an augmented class is present
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces augmented Gini impurity for tree splitting
Develops LACForest with shallow and deep neural forests
Uses pseudo-labeled augmented instances for better performance
Fan Xu
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Wuyang Chen
Assistant Professor, CS@Simon Fraser University
Scientific Machine Learning, Computer Vision, Large Language Models, Reasoning
Wei Gao
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China