Decision Boundary-aware Generation for Long-tailed Learning

📅 2026-05-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses long-tailed classification, where imbalanced data distributions bias decision boundaries toward head classes and degrade performance on tail classes. Existing generative augmentation methods that transfer knowledge from head to tail classes can exacerbate inter-class feature entanglement and boundary ambiguity. To mitigate these issues, the authors propose a Decision Boundary-aware Generation (DBG) framework that explicitly focuses on synthesizing highly informative samples near class decision boundaries. Leveraging a diffusion model, DBG generates boundary-proximal instances that enhance representation learning while alleviating distribution shift and inter-class mixing for tail classes. Experimental results on standard long-tailed benchmarks show that DBG consistently improves both overall and tail-class accuracy and reduces inter-class overlap, yielding a more separable decision space.

📝 Abstract

Long-tailed data bias decision boundaries toward head classes and degrade tail-class accuracy. Diffusion-based generative augmentation addresses this problem by generating additional data, while head-to-tail transfer further mitigates the generator bias inherited from the long-tailed dataset. However, we show that while head-to-tail transfer helps balance the decision space of the classifier, it also induces latent non-local feature mixing that entangles inter-class features, causing decision boundary overlap and tail-class distribution shift. To address this, we first identify the problem of boundary ambiguity and then propose the Decision Boundary-aware Generation (DBG) framework, which promotes near-boundary representation learning by generating informative near-boundary samples. Overall, DBG rebalances the long-tailed dataset while yielding a more separable decision space for long-tailed learning. Across standard long-tailed benchmarks, DBG consistently improves tail-class and overall accuracy with less inter-class overlap. The code of DBG is available at https://github.com/keepdigitalabc-svg/DBG.

Problem

Research questions and friction points this paper is trying to address.

long-tailed learning

decision boundary

feature entanglement

class imbalance

generative augmentation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decision Boundary-aware Generation

Long-tailed Learning

Diffusion-based Augmentation