AI Summary
In long-tailed visual recognition, models degrade significantly on tail classes, primarily because they lack the semantic abstraction needed to learn discriminative features. To address this, we propose a semantic-driven super-class graph modeling framework. Our approach introduces, for the first time, a meta-learning-driven super-class discovery mechanism that is guided by prototype-based graph supervision to construct semantic hierarchies, and it employs dynamic graph message passing to recalibrate image representations. By unifying meta-learning, prototypical learning, and graph neural networks, our method optimizes representations robustly at the level of semantic abstraction. Extensive experiments on long-tailed CIFAR-100, ImageNet, Places, and iNaturalist demonstrate consistent improvements over state-of-the-art methods, with substantial gains in tail-class accuracy and new state-of-the-art results on multiple metrics.
Abstract
Modern image classifiers perform well on well-populated classes but degrade considerably on tail classes with only a few instances. Humans, by contrast, handle the long-tailed recognition challenge effortlessly, since they can learn tail representations at different levels of semantic abstraction, which makes the learned tail features more discriminative. This phenomenon motivated us to propose SuperDisco, an algorithm that discovers super-class representations for long-tailed recognition using a graph model. We learn to construct a super-class graph that guides representation learning under long-tailed distributions. Through message passing on the super-class graph, image representations are rectified and refined by attending to the most relevant entities based on the semantic similarity among their super-classes. Moreover, we propose to meta-learn the super-class graph under the supervision of a prototype graph constructed from a small amount of imbalanced data. By doing so, we obtain a more robust super-class graph that further improves long-tailed recognition performance. Consistent state-of-the-art results on long-tailed CIFAR-100, ImageNet, Places, and iNaturalist demonstrate the benefit of the discovered super-class graph for dealing with long-tailed distributions.
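To make the refinement step more concrete, the following is a minimal NumPy sketch of attention-style message passing from super-class nodes to image features. It is an illustration under stated assumptions, not the SuperDisco implementation: the function name `refine_with_superclasses`, the cosine-similarity attention, the `temperature` parameter, and the residual update are all hypothetical choices, since the abstract does not specify these details. In the actual method the super-class nodes are learned (and meta-learned against a prototype graph); here they are simply given as an array.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def refine_with_superclasses(features, superclass_nodes, temperature=1.0):
    """Refine image features by attending to super-class nodes.

    features:          (N, D) array of image representations.
    superclass_nodes:  (K, D) array, one vector per super-class
                       (assumed given; learned in the real method).
    temperature:       hypothetical sharpening parameter for attention.

    Returns (refined, attention) with shapes (N, D) and (N, K).
    """
    eps = 1e-8
    # Cosine similarity between every image feature and every super-class.
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + eps)
    s = superclass_nodes / (
        np.linalg.norm(superclass_nodes, axis=1, keepdims=True) + eps
    )
    similarity = f @ s.T  # (N, K)

    # Each image attends most strongly to its most similar super-classes.
    attention = softmax(similarity / temperature, axis=1)

    # Message: attention-weighted mix of super-class vectors.
    message = attention @ superclass_nodes  # (N, D)

    # Residual update (an assumption): keep the original feature and
    # add the semantic context gathered from the super-class graph.
    refined = features + message
    return refined, attention


# Toy usage: three super-classes in a 4-dimensional feature space.
nodes = np.eye(3, 4)
images = np.array([[2.0, 0.1, 0.0, 0.0],
                   [0.0, 0.0, 3.0, 0.2]])
refined, attention = refine_with_superclasses(images, nodes, temperature=0.1)
```

A lower `temperature` concentrates each image's attention on its closest super-class, while a higher value spreads the message over several super-classes. This trade-off is part of the sketch, not something the abstract reports.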