🤖 AI Summary
Traditional pointwise ranking models for Chinese text reranking suffer from limited capacity to capture semantic contrasts among candidate passages and low training efficiency. To address these issues, this paper proposes ListConRanker, a listwise contrastive reranking framework. Its core innovations include: (1) the first introduction of a listwise Transformer encoding mechanism that explicitly models intra-list semantic contrast relationships in a single forward pass; and (2) deep integration of listwise encoding with contrastive learning, coupled with Circle Loss (replacing standard cross-entropy) to significantly improve gradient stability and convergence speed. Extensive evaluation on major Chinese reranking benchmarks, including cMedQA1.0/2.0, MMarcoReranking, and T2Reranking, demonstrates that ListConRanker achieves state-of-the-art performance across NDCG@10 and MRR metrics. This work establishes a new, efficient, and robust paradigm for reranking in retrieval-augmented generation.
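The listwise-vs-pointwise distinction can be illustrated with a toy sketch. This is not the paper's architecture (ListConRanker uses a full listwise Transformer); it only shows the core idea with a single round of plain dot-product attention, where every candidate's representation is updated from all other candidates for the same query in one pass:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def listwise_attention(passage_embs):
    """One round of self-attention across a candidate list.

    Unlike pointwise encoding, which scores each (query, passage) pair
    in isolation, here each passage embedding is refined using ALL
    candidates for the same query, so contrasts between candidates can
    be encoded. Toy version: no projections, no multi-head attention.
    """
    refined = []
    for q in passage_embs:
        # Attention weights of this passage over the whole list.
        weights = softmax([dot(q, k) for k in passage_embs])
        # New representation: convex combination of all candidates.
        refined.append([
            sum(w * k[d] for w, k in zip(weights, passage_embs))
            for d in range(len(q))
        ])
    return refined
```

Because each output is a convex combination of the inputs, a passage's refined representation now depends on every other passage in the list, which is exactly the information a pointwise encoder never sees.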
📝 Abstract
Reranker models re-rank passages according to their semantic similarity to a given query, and have recently received growing attention due to the wide application of Retrieval-Augmented Generation. Most previous methods use pointwise encoding: each passage is encoded together with the query, independently of the other candidates. For a reranker, however, the comparisons among passages for a given query matter even more; capturing them requires listwise encoding. In addition, previous models are trained with the cross-entropy loss function, which leads to unsmooth gradient changes during training and low training efficiency. To address these issues, we propose a novel Listwise-encoded Contrastive text reRanker (ListConRanker). It allows each passage to be compared with the other passages during encoding, and it enhances the contrastive information between positive examples as well as between positive and negative examples. We further train the model with Circle Loss, which increases the flexibility of gradients and improves training efficiency. Experimental results show that ListConRanker achieves state-of-the-art performance on the reranking benchmarks of the Chinese Massive Text Embedding Benchmark, including the cMedQA1.0, cMedQA2.0, MMarcoReranking, and T2Reranking datasets.
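The Circle Loss mentioned above can be sketched as follows. This is an illustrative reimplementation of the standard formulation (Sun et al., 2020), not the authors' code; the margin `m` and scale `gamma` values are hypothetical defaults:

```python
import math

def circle_loss(pos_sims, neg_sims, m=0.25, gamma=32.0):
    """Circle Loss over one query's list of candidate passages.

    pos_sims / neg_sims: similarity scores in [0, 1] for positive and
    negative passages. Unlike cross-entropy, each score gets its own
    adaptive weight: scores already near their optimum
    (O_p = 1 + m for positives, O_n = -m for negatives) are
    down-weighted, which smooths gradients during training.
    """
    delta_p, delta_n = 1.0 - m, m   # decision margins
    o_p, o_n = 1.0 + m, -m          # optima for pos / neg scores
    # Positives: penalize s_p below delta_p, weighted by (O_p - s_p)+.
    sum_p = sum(
        math.exp(-gamma * max(0.0, o_p - sp) * (sp - delta_p))
        for sp in pos_sims
    )
    # Negatives: penalize s_n above delta_n, weighted by (s_n - O_n)+.
    sum_n = sum(
        math.exp(gamma * max(0.0, sn - o_n) * (sn - delta_n))
        for sn in neg_sims
    )
    return math.log(1.0 + sum_p * sum_n)
```

A well-separated list (positives scored high, negatives low) yields a loss near zero, while overlapping scores are penalized sharply, and the per-score weighting keeps gradient magnitudes adapted to how far each similarity is from its optimum.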