TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking

📅 2025-08-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing LLM-based ranking models rely on large language models and explicit chain-of-thought (CoT) reasoning, which incurs high computational overhead and latency and hinders practical deployment. To address this, the authors propose TFRank, a lightweight pointwise ranking model built on small LLMs (e.g., 1.7B parameters) that features a "think-mode switch". During training, TFRank learns jointly from CoT-annotated data and fine-grained relevance scores via multi-task learning; during inference, it skips explicit CoT generation and outputs relevance scores directly, so that scoring no longer depends on producing a reasoning chain. Because no reasoning text is generated at inference time, latency and resource consumption drop. Experiments show that TFRank matches the performance of models with four times its parameter count on BRIGHT, while remaining competitive on BEIR. TFRank thus offers a practical, low-overhead option for real-world retrieval systems.

📝 Abstract

Reasoning-intensive ranking models built on Large Language Models (LLMs) have made notable progress, but existing approaches often rely on large-scale LLMs and explicit Chain-of-Thought (CoT) reasoning, resulting in high computational cost and latency that limit real-world use. To address this, we propose TFRank, an efficient pointwise reasoning ranker based on small-scale LLMs. To improve ranking performance, TFRank effectively integrates CoT data, fine-grained score supervision, and multi-task training. Furthermore, it achieves an efficient "Think-Free" reasoning capability by employing a "think-mode switch" and pointwise format constraints. Specifically, this allows the model to leverage explicit reasoning during training while delivering precise relevance scores for complex queries at inference without generating any reasoning chains. Experiments show that TFRank (e.g., 1.7B) achieves performance comparable to models with four times more parameters on the BRIGHT benchmark, and demonstrates strong competitiveness on the BEIR benchmark. Further analysis shows that TFRank achieves an effective balance between performance and efficiency, providing a practical solution for integrating advanced reasoning into real-world systems. Our code and data are released in the repository: https://github.com/JOHNNY-fans/TFRank.
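
To make the inference-time behavior described in the abstract concrete, here is a minimal sketch of think-free pointwise ranking. It is an illustration under stated assumptions, not the authors' implementation: the switch tokens `/think` and `/no_think`, the `<score>` tag, the 0 to 4 score range, and the `generate` callable are all hypothetical placeholders, since the summary only states that a think-mode switch and pointwise format constraints exist.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical switch tokens (assumption): the source only says that a
# "think-mode switch" exists, not how it is spelled.
THINK_ON = "/think"
THINK_OFF = "/no_think"

# Hypothetical pointwise output format (assumption): an integer in a score tag.
SCORE_RE = re.compile(r"<score>\s*(\d+)\s*</score>")


def build_prompt(query: str, doc: str, think: bool) -> str:
    """Build a pointwise prompt for one (query, document) pair."""
    switch = THINK_ON if think else THINK_OFF
    return (
        f"Query: {query}\n"
        f"Document: {doc}\n"
        "Rate the relevance of the document to the query as an integer "
        "from 0 to 4, formatted as <score>N</score>. "
        f"{switch}"
    )


def parse_score(output: str, default: int = 0) -> int:
    """Extract the integer score; fall back to `default` if the tag is missing."""
    match = SCORE_RE.search(output)
    return int(match.group(1)) if match else default


def rank(
    query: str,
    docs: List[str],
    generate: Callable[[str], str],
) -> List[Tuple[str, int]]:
    """Score each document independently with thinking off, then sort descending."""
    scored = [
        (doc, parse_score(generate(build_prompt(query, doc, think=False))))
        for doc in docs
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Here `generate` stands in for any function that maps a prompt string to the model's text output. Because each document is scored on its own and the output is only a short score, the generation cost per document stays small compared with emitting a full reasoning chain.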

Problem

Research questions and friction points this paper is trying to address.

Reduces computational cost in LLM ranking models

Enhances ranking performance with efficient reasoning

Balances performance and efficiency for real-world use

Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient pointwise ranker with small-scale LLMs

Integrates CoT data and fine-grained supervision

Think-free reasoning via mode switch and constraints
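
The training side of these points can be sketched in the same spirit. The snippet below shows one plausible way to turn a single CoT-annotated example into two supervised targets, one with reasoning and one with the score only, so that a model sees both modes during multi-task training. The field names, the `/think` and `/no_think` tokens, the `<think>` and `<score>` tags, and the 0 to 4 scale are hypothetical choices for illustration; the paper's actual data format may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AnnotatedExample:
    """One query-document pair with a reasoning chain and a fine-grained score."""
    query: str
    doc: str
    cot: str     # annotated chain-of-thought text
    score: int   # fine-grained relevance label (0 to 4 is an assumption)


def to_training_pairs(ex: AnnotatedExample) -> List[Dict[str, str]]:
    """Emit a think-mode and a think-free training pair for the same example."""
    base = (
        f"Query: {ex.query}\n"
        f"Document: {ex.doc}\n"
        "Rate relevance from 0 to 4."
    )
    return [
        # Think mode: the target contains the reasoning followed by the score.
        {
            "prompt": f"{base} /think",
            "target": f"<think>{ex.cot}</think><score>{ex.score}</score>",
        },
        # Think-free mode: the target is the score alone.
        {
            "prompt": f"{base} /no_think",
            "target": f"<score>{ex.score}</score>",
        },
    ]
```

Pairing both targets with the same example is one way to let the score-only mode benefit from reasoning supervision while keeping inference output short.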

🔎 Similar Papers

Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers