🤖 AI Summary
This work addresses class imbalance in natural language processing with the Hardness-Aware Meta-Resample (HAMR) framework, which combines meta-learning with neighborhood-aware resampling. HAMR uses bilevel optimization to dynamically estimate instance-level sample weights, emphasizing difficult samples together with their semantically similar neighbors to strengthen minority-class representations. The framework is model-agnostic and demonstrates strong cross-domain generalization. Evaluated on six imbalanced datasets spanning biomedical text, disaster response, and sentiment analysis, HAMR consistently outperforms strong baselines, particularly on minority classes, and ablation studies confirm that each component contributes to its overall effectiveness.
📝 Abstract
Class imbalance is a widespread challenge in NLP tasks, significantly hindering robust performance across diverse domains and applications. We introduce Hardness-Aware Meta-Resample (HAMR), a unified framework that adaptively addresses both class imbalance and data difficulty. HAMR employs bilevel optimization to dynamically estimate instance-level weights that prioritize genuinely challenging samples and minority classes, while a neighborhood-aware resampling mechanism amplifies training focus on hard examples and their semantically similar neighbors. We validate HAMR on six imbalanced datasets covering multiple tasks and spanning the biomedical, disaster response, and sentiment domains. Experimental results show that HAMR achieves substantial improvements for minority classes and consistently outperforms strong baselines. Extensive ablation studies demonstrate that the proposed modules contribute synergistically to the performance gains and highlight HAMR as a flexible and generalizable approach to class imbalance adaptation. Code is available at https://github.com/trust-nlp/ImbalanceLearning.
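
To make the neighborhood-aware resampling idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each training sample already has an embedding and a non-negative hardness weight (in HAMR these weights would come from the bilevel optimization, which is not reproduced here). The function name `neighborhood_sampling_probs`, the parameters `k` and `spread`, and the choice of cosine similarity are all hypothetical choices made for illustration.

```python
import numpy as np


def neighborhood_sampling_probs(embeddings, hardness, k=2, spread=0.5):
    """Convert per-sample hardness scores into sampling probabilities.

    Illustrative assumption: each sample keeps its own hardness and also
    passes `spread * hardness / k` to each of its k nearest neighbors,
    so samples close to a hard example are drawn more often as well.
    Requires non-zero embeddings and k <= n_samples - 1.
    """
    x = np.asarray(embeddings, dtype=float)
    h = np.asarray(hardness, dtype=float)

    # Cosine similarity between L2-normalized embeddings.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbor

    # Indices of the k most similar samples for every row.
    nbrs = np.argsort(-sim, axis=1)[:, :k]

    # Start from each sample's own hardness, then add the shares
    # received from every sample that lists it as a neighbor.
    scores = h.copy()
    np.add.at(scores, nbrs.ravel(), np.repeat(spread * h / k, k))

    # Normalize to a probability distribution over the training set.
    return scores / scores.sum()
```

The returned vector could then drive a resampled epoch, for example with `np.random.default_rng(0).choice(len(probs), size=len(probs), p=probs)`. Spreading hardness over neighbors rather than upweighting hard samples alone is one simple way to express the "hard examples and their semantically similar neighbors" emphasis described in the abstract; the actual mechanism in the released code may differ.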