TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking

📅 2025-08-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing LLM-based ranking models rely on large-scale language models and explicit chain-of-thought (CoT) reasoning, incurring high computational overhead and latency that hinder practical deployment. To address this, we propose TFRank, a lightweight pointwise ranking model built on small LLMs (e.g., 1.7B parameters) that features a novel "reasoning-mode switching" mechanism. During training, TFRank jointly learns from CoT-annotated data and fine-grained relevance scores via multi-task learning; at inference, it skips CoT generation entirely and directly outputs relevance scores, effectively decoupling ranking from explicit reasoning. This design eliminates redundant text generation, drastically reducing latency and resource consumption. Experiments show that TFRank matches the performance of models with four times its parameter count on BRIGHT, while remaining highly competitive on BEIR. TFRank thus provides a practical, high-accuracy, low-overhead solution for real-world retrieval systems.

📝 Abstract
Reasoning-intensive ranking models built on Large Language Models (LLMs) have made notable progress, but existing approaches often rely on large-scale LLMs and explicit Chain-of-Thought (CoT) reasoning, resulting in high computational cost and latency that limit real-world use. To address this, we propose TFRank, an efficient pointwise reasoning ranker based on small-scale LLMs. To improve ranking performance, TFRank effectively integrates CoT data, fine-grained score supervision, and multi-task training. Furthermore, it achieves an efficient "Think-Free" reasoning capability by employing a "think-mode switch" and pointwise format constraints. Specifically, this allows the model to leverage explicit reasoning during training while delivering precise relevance scores for complex queries at inference without generating any reasoning chains. Experiments show that TFRank (e.g., 1.7B) achieves performance comparable to models with four times more parameters on the BRIGHT benchmark, and demonstrates strong competitiveness on the BEIR benchmark. Further analysis shows that TFRank achieves an effective balance between performance and efficiency, providing a practical solution for integrating advanced reasoning into real-world systems. Our code and data are released at: https://github.com/JOHNNY-fans/TFRank.
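The pointwise setup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt template, the `/think` / `/no_think` mode tokens, and the 0-4 score range are all assumptions chosen for the sketch (see the TFRank repository for the real format).

```python
import re

# Hypothetical TFRank-style prompting: a "think-mode switch" toggles
# whether the model is asked to emit reasoning before its score.
def build_prompt(query: str, passage: str, think: bool) -> str:
    """Pointwise ranking prompt; `think` enables CoT generation.
    The mode tokens and score scale are illustrative assumptions."""
    mode = "/think" if think else "/no_think"
    return (
        f"{mode}\n"
        f"Query: {query}\n"
        f"Passage: {passage}\n"
        "Rate the relevance of the passage to the query on a 0-4 scale."
    )

def parse_score(output: str) -> float:
    """Extract the final numeric relevance score from model output.
    In think-free mode the output is just the score; with CoT,
    the last number emitted is taken as the score."""
    numbers = re.findall(r"\d+(?:\.\d+)?", output)
    if not numbers:
        raise ValueError("no score found in model output")
    return float(numbers[-1])

def rank(query: str, passages: list[str], score_fn) -> list[str]:
    """Pointwise ranking: score each passage independently
    (think-free mode), then sort by descending score."""
    scored = [(score_fn(build_prompt(query, p, think=False)), p)
              for p in passages]
    return [p for _, p in sorted(scored, key=lambda x: -x[0])]
```

Because scoring is pointwise, each passage is a single independent forward pass with no generated reasoning chain, which is what makes the batched, low-latency inference claimed in the abstract possible.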
Problem

Research questions and friction points this paper is trying to address.

Reduces computational cost in LLM ranking models
Enhances ranking performance with efficient reasoning
Balances performance and efficiency for real-world use
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient pointwise ranker with small-scale LLMs
Integrates CoT data and fine-grained supervision
Think-free reasoning via mode switch and constraints
Yongqi Fan
East China University of Science and Technology
LLM, AI Search, Medical NLP, IR, Agentic RL
Xiaoyang Chen
University of Chinese Academy of Sciences
Dezhi Ye
Tencent
Jie Liu
Tencent
Haijin Liang
Tencent
Jin Ma
Tencent
Ben He
Professor, University of Chinese Academy of Sciences
Natural Language Processing, Information Retrieval
Yingfei Sun
University of Chinese Academy of Sciences
Tong Ruan
East China University of Science and Technology
Clinical NLP, LLM, KG