Acquiring and Accumulating Knowledge from Diverse Datasets for Multi-label Driving Scene Classification

📅 2025-06-20
🤖 AI Summary
To address two key challenges in multi-label driving-scene classification, namely the difficulty of acquiring a balanced, comprehensively annotated multi-label dataset and imbalanced learning across tasks, this paper proposes a framework that combines Knowledge Acquisition and Accumulation (KAA) with Consistency-based Active Learning (CAL). KAA builds multi-label scene knowledge from heterogeneous single-label datasets via monotask learning, multi-task knowledge distillation, and consistency regularization. CAL then closes the knowledge gap caused by the discrepancy between the marginal distributions of individual attributes and their joint distribution. In an ablation study on the newly introduced Driving Scene Identification (DSI) dataset, the method achieves a 56.1% performance gain over a baseline pretrained on ImageNet (31.3% from KAA and 24.8% from CAL). On the public BDD100K and HSD datasets, it outperforms state-of-the-art multi-label models while using 85% less data, which points to better annotation efficiency and cross-dataset generalization.

📝 Abstract
Driving scene identification, which assigns multiple non-exclusive class labels to a scene, provides the contextual awareness necessary for enhancing autonomous vehicles' ability to understand, reason about, and interact with the complex driving environment. As a multi-label classification problem, it is better tackled via multitask learning. However, directly training a multi-label classification model for driving scene identification through multitask learning presents two main challenges: acquiring a balanced, comprehensively annotated multi-label dataset and balancing learning across different tasks. This paper introduces a novel learning system that synergizes knowledge acquisition and accumulation (KAA) with consistency-based active learning (CAL) to address those challenges. KAA acquires and accumulates knowledge about scene identification from various single-label datasets via monotask learning. Subsequently, CAL effectively resolves the knowledge gap caused by the discrepancy between the marginal distributions of individual attributes and their joint distribution. An ablation study on our Driving Scene Identification (DSI) dataset demonstrates a 56.1% performance increase over the baseline model pretrained on ImageNet. Of this, KAA accounts for 31.3% of the gain, and CAL contributes 24.8%. Moreover, KAA-CAL stands out as the best performer when compared to state-of-the-art (SOTA) multi-label models on two public datasets, BDD100K and HSD, achieving this while using 85% less data. The DSI dataset and the implementation code for KAA-CAL are available at https://github.com/KELISBU/KAA-CAL.
Problem

Research questions and friction points this paper is trying to address.

Acquiring a balanced, comprehensively annotated multi-label dataset for driving scenes
Balancing learning across tasks in multitask multi-label classification
Bridging the knowledge gap between the marginal distributions of individual attributes and their joint distribution
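
To make the last point concrete, the sketch below measures how far an empirical joint distribution of scene attributes is from the product of its per-attribute marginals. It is an illustration only and is not taken from the paper: the function names (`marginals`, `joint`, `independence_gap`), the attribute values, and the toy data are all hypothetical. Models trained on separate single-label datasets only see the marginals, so a large gap suggests how much co-occurrence information they could miss.

```python
from collections import Counter
from itertools import product


def marginals(labels):
    """Per-attribute class frequencies for a list of label tuples."""
    n = len(labels)
    n_attrs = len(labels[0])
    return [
        {cls: cnt / n for cls, cnt in Counter(row[a] for row in labels).items()}
        for a in range(n_attrs)
    ]


def joint(labels):
    """Empirical frequency of each full label combination."""
    n = len(labels)
    return {combo: cnt / n for combo, cnt in Counter(labels).items()}


def independence_gap(labels):
    """Total variation distance between the empirical joint distribution
    and the product of the per-attribute marginals (0 means independent)."""
    margs = marginals(labels)
    jt = joint(labels)
    gap = 0.0
    for combo in product(*(m.keys() for m in margs)):
        indep = 1.0
        for a, cls in enumerate(combo):
            indep *= margs[a][cls]
        gap += abs(jt.get(combo, 0.0) - indep)
    return 0.5 * gap


# Hypothetical toy data: each scene is (weather, road surface).
scenes = (
    [("clear", "dry")] * 5
    + [("rainy", "wet")] * 4
    + [("clear", "wet")] * 1
)
print(round(independence_gap(scenes), 3))
```

In this toy set, weather and road surface are strongly correlated, so the gap is well above zero; a dataset where every combination appears equally often would give a gap of zero.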
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge acquisition from single-label datasets
Consistency-based active learning integration
Synergized KAA-CAL for multi-label classification
Ke Li
Department of Civil Engineering, Stony Brook University, Stony Brook, NY 11794, USA
Chenyu Zhang
Department of Civil Engineering, Stony Brook University, Stony Brook, NY 11794, USA
Yuxin Ding
Department of Civil and Environmental Engineering, Pennsylvania State University, University Park, PA 16802, USA
Xianbiao Hu
Department of Civil and Environmental Engineering, Pennsylvania State University, University Park, PA 16802, USA
Ruwen Qin
Stony Brook University
Visual Perception and Cognition, Collective Intelligence