🤖 AI Summary
This study addresses sentence-level gender bias detection and mitigation in Chinese text, proposing an end-to-end fairness-enhancement framework. Methodologically: (1) it employs efficient LoRA-based fine-tuning of large language models, augmented by multi-model ensemble voting to improve detection robustness; (2) it constructs a balanced, heterogeneous dataset encompassing diverse bias expressions from multiple sources; and (3) it introduces a multi-temperature sampling strategy to dynamically regulate the generation process for fine-grained bias control. The key innovation lies in the synergistic integration of lightweight adaptation, ensemble learning, and controllable decoding, jointly optimizing bias identification accuracy and generation controllability. Evaluated on the NLPCC-2025 shared task, the framework achieves a 47.90% average F1-score, ranking fourth overall, thereby demonstrating its dual efficacy in accurate bias detection and effective bias mitigation.
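The paper does not release code, but the multi-temperature sampling strategy rests on a standard mechanism: dividing the model's logits by a temperature before the softmax, then drawing candidates at several temperatures. The sketch below illustrates that mechanism only; the function names and the example temperature values are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng=random):
    """Sample a token index from a softmax over logits scaled by 1/temperature.

    Lower temperatures sharpen the distribution (more conservative outputs);
    higher temperatures flatten it (more varied phrasings).
    """
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def multi_temperature_candidates(generate, prompt, temperatures=(0.2, 0.7, 1.0)):
    """Hypothetical wrapper: one candidate generation per temperature.

    `generate` stands in for a decoding call; the temperature grid is an
    illustrative choice, not taken from the paper.
    """
    return [generate(prompt, t) for t in temperatures]
```

At a near-zero temperature the sampler collapses onto the arg-max token, while higher temperatures let alternative bias-expression styles surface, which is the variation the strategy is designed to capture.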
📄 Abstract
This paper presents our team's solution to Shared Task 7 of NLPCC-2025, which focuses on sentence-level gender bias detection and mitigation in Chinese. The task aims to promote fairness and controllability in natural language generation by automatically detecting, classifying, and mitigating gender bias. To address this challenge, we adopt a fine-tuning approach based on large language models (LLMs), adapting them efficiently to the bias detection task via Low-Rank Adaptation (LoRA). For data processing, we construct a more balanced training set to alleviate class imbalance and introduce heterogeneous samples from multiple sources to enhance model generalization. For the detection and classification sub-tasks, we employ a majority voting strategy that integrates the outputs of multiple expert models to boost performance. Additionally, to improve bias detection and mitigation during generation, we design a multi-temperature sampling mechanism to capture variations in how bias is expressed. Experimental results demonstrate the effectiveness of our approach in bias detection, classification, and mitigation. Our method achieves an average score of 47.90%, ranking fourth in the shared task.
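The majority voting step described above can be sketched in a few lines: each expert model emits a label per sentence, and the aggregated label is the one most models agree on. The tie-breaking rule below (deferring to the first model) is an illustrative assumption, since the abstract does not specify how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Aggregate per-sentence labels from several expert models.

    predictions: one label list per model, aligned by sentence index.
    Ties fall back to the first model's label (an assumption for this
    sketch, not a detail given in the paper).
    """
    n_models = len(predictions)
    results = []
    for labels in zip(*predictions):
        top_label, top_count = Counter(labels).most_common(1)[0]
        # Require a strict majority; otherwise defer to the first model.
        results.append(top_label if top_count > n_models // 2 else labels[0])
    return results
```

For example, with three detectors voting on two sentences, a 2-of-3 agreement decides each label, which is how an ensemble can smooth over individual models' errors on ambiguous bias expressions.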