🤖 AI Summary
This work addresses a limitation of conventional fine-tuning losses: they operate only within local neighborhoods of the embedding space and overlook its global semantic structure. To overcome this, the authors propose G-Loss, a graph-guided loss function that integrates semi-supervised label propagation into the fine-tuning process. By constructing a document-similarity graph and propagating label information across it, G-Loss explicitly models global semantic relationships, guiding the model to learn more discriminative and robust embeddings. Evaluations on five benchmark datasets (MR, R8, R52, Ohsumed, and 20NG) show that, in the majority of experimental setups, this approach converges faster, yields more semantically coherent embedding spaces, and achieves higher classification accuracy than fine-tuning with traditional loss functions.
📝 Abstract
Traditional loss functions used for fine-tuning pre-trained language models such as BERT, including cross-entropy, contrastive, triplet, and supervised contrastive losses, operate only within local neighborhoods and fail to account for the global semantic structure. We present G-Loss, a graph-guided loss function that incorporates semi-supervised label propagation to exploit structural relationships within the embedding manifold. G-Loss builds a document-similarity graph that captures global semantic relationships, thereby guiding the model to learn more discriminative and robust embeddings. We evaluate G-Loss on five benchmark datasets covering key downstream classification tasks: MR (sentiment analysis), R8 and R52 (topic categorization), Ohsumed (medical document classification), and 20NG (news categorization). In the majority of experimental setups, G-Loss converges faster and produces semantically coherent embedding spaces, resulting in higher classification accuracy than models fine-tuned with traditional loss functions.
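The abstract does not spell out how the graph or the propagation step is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Gaussian-kernel similarity graph over a batch of embeddings and the classic closed-form label-spreading solution F = (I - αS)⁻¹Y; the function names and the `sigma` and `alpha` parameters are hypothetical. Labels of a few "labeled" nodes are propagated over the graph, and the loss is the cross-entropy of the propagated predictions on the remaining nodes.

```python
import numpy as np


def similarity_graph(emb, sigma=1.0):
    """Dense Gaussian-kernel affinity matrix (assumed kernel, not from the paper)."""
    sq_dist = np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1)
    w = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w


def propagate_labels(w, y_seed, alpha=0.9):
    """Closed-form label spreading: F = (I - alpha * S)^-1 Y."""
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg + 1e-12)
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    n = w.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * s, y_seed)


def graph_loss(emb, labels, labeled_mask, num_classes, sigma=1.0, alpha=0.9):
    """Cross-entropy of propagated labels on the nodes whose labels were hidden."""
    labels = np.asarray(labels)
    labeled_mask = np.asarray(labeled_mask, dtype=bool)
    # Seed matrix: one-hot rows for labeled nodes, zero rows for hidden ones.
    y_seed = np.eye(num_classes)[labels] * labeled_mask[:, None]
    f = propagate_labels(similarity_graph(emb, sigma), y_seed, alpha)
    f = np.clip(f, 1e-12, None)
    probs = f / f.sum(axis=1, keepdims=True)
    hidden = ~labeled_mask
    return -np.mean(np.log(probs[hidden, labels[hidden]]))
```

Under these assumptions the loss is small when embeddings of the same class sit close together on the graph and large when hidden nodes lie nearer to labeled nodes of another class. This NumPy version only evaluates the loss; using it for fine-tuning would mean writing the same operations in an autodiff framework so that gradients reach the encoder.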