🤖 AI Summary
To address weak domain generalization and high annotation costs in multi-label text classification, this paper proposes a domain-agnostic generative framework: predefined labels are mapped to natural-language semantic descriptions, the model generates these descriptions directly from the input text, and a fine-tuned Sentence Transformer then aligns the generated outputs with the ground-truth label descriptions. The key contribution is the first formulation of labels as generative semantic units, coupled with a dual-objective joint optimization: cross-entropy loss for label identification and a cosine-similarity loss that enforces semantic fidelity of the generated descriptions. The method is parameter-efficient and enables zero-shot cross-domain transfer without domain adaptation. Evaluated on multiple standard benchmarks, it achieves state-of-the-art performance, improving Micro-F1 and Macro-F1 by 13.94% and 24.85%, respectively, over the closest baseline.
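The dual objective pairs a token-level generation loss with a sentence-level semantic loss. Below is a minimal PyTorch sketch of that combination, assuming decoder logits over the description tokens and sentence embeddings (e.g., from the fine-tuned Sentence Transformer) for the generated and target descriptions; the weighting `alpha` and the exact embedding wiring are illustrative assumptions, not details taken from the paper:

```python
# Hypothetical sketch of the dual-objective loss; the actual LAGAMC wiring
# (e.g., how gradients flow through the generated-text embedding) may differ.
import torch
import torch.nn.functional as F

def dual_objective_loss(lm_logits, target_ids, gen_emb, target_emb, alpha=0.5):
    """Combine token-level cross-entropy with a sentence-level cosine loss.

    lm_logits:  (batch, seq_len, vocab) decoder logits for the label description
    target_ids: (batch, seq_len) token ids of the ground-truth description
    gen_emb:    (batch, dim) embedding of the generated description
    target_emb: (batch, dim) embedding of the ground-truth description
    alpha:      balance between the two terms (assumed hyperparameter)
    """
    # Cross-entropy drives correct label identification at the token level.
    ce = F.cross_entropy(lm_logits.reshape(-1, lm_logits.size(-1)),
                         target_ids.reshape(-1))
    # Cosine term pushes generated descriptions toward the target semantics.
    cos = 1.0 - F.cosine_similarity(gen_emb, target_emb, dim=-1).mean()
    return alpha * ce + (1.0 - alpha) * cos
```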
📝 Abstract
The explosion of textual data has made manual document classification increasingly challenging. To address this, we introduce a robust, efficient, domain-agnostic generative framework for multi-label text classification. Instead of treating labels as mere atomic symbols, our approach uses predefined label descriptions and is trained to generate these descriptions from the input text. During inference, the generated descriptions are matched to the predefined labels using a fine-tuned Sentence Transformer. We integrate this with a dual-objective loss function, combining cross-entropy loss with a cosine-similarity loss between the generated sentences and the predefined target descriptions, ensuring both label accuracy and semantic alignment. Our proposed model, LAGAMC, stands out for its parameter efficiency and versatility across diverse datasets, making it well suited for practical applications. We demonstrate its effectiveness by achieving new state-of-the-art performance across all evaluated datasets, surpassing several strong baselines, with improvements of 13.94% in Micro-F1 and 24.85% in Macro-F1 over the closest baseline.
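As a rough illustration of the inference step, the sketch below embeds generated descriptions and predefined label descriptions with a sentence-transformers model and assigns each generation to its nearest label by cosine similarity; the checkpoint name, the example labels, and the `threshold` cutoff are placeholders, not the paper's fine-tuned model or settings:

```python
# Illustrative inference-time matching; the real system would load the
# paper's fine-tuned Sentence Transformer rather than a stock checkpoint.
from sentence_transformers import SentenceTransformer, util

matcher = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in checkpoint

# Hypothetical label set with natural-language descriptions.
label_descriptions = {
    "sports": "Text about athletic competitions, teams, and games.",
    "politics": "Text about government, elections, and public policy.",
}
labels = list(label_descriptions)
label_emb = matcher.encode(list(label_descriptions.values()),
                           convert_to_tensor=True)

def match_labels(generated_descriptions, threshold=0.5):
    """Map each generated description to the closest predefined label."""
    gen_emb = matcher.encode(generated_descriptions, convert_to_tensor=True)
    sims = util.cos_sim(gen_emb, label_emb)  # (num_generated, num_labels)
    return [
        labels[int(row.argmax())]
        for row in sims
        if float(row.max()) >= threshold  # drop weak matches
    ]
```

A call such as `match_labels(["A report on an upcoming election debate."])` would return `["politics"]` under these assumptions, since that generation sits closest to the politics description in embedding space.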