🤖 AI Summary
This work addresses the lack of systematic surveys on reward models (RMs) for aligning large language models (LLMs). We propose a unified taxonomy of RMs and a three-dimensional "collect, model, use" analytical framework to holistically examine preference data construction, RM architectures, and downstream alignment applications. Synthesizing over 120 peer-reviewed studies, we construct a structured knowledge graph capturing methodological trends and empirical insights. Our analysis identifies six open challenges (scalability, generalization, alignment bias, reward hacking, data efficiency, and cross-task transferability) and distills four emerging research directions. Furthermore, we release an open-source, comprehensive resource repository encompassing benchmark datasets, reference implementations, and standardized evaluation protocols. This survey fills a critical gap in the RM literature, serving both as an accessible entry point for newcomers and as a foundational reference for coordinated progress in LLM alignment research.
📝 Abstract
Reward models (RMs) have demonstrated impressive potential for enhancing large language models (LLMs), as an RM can serve as a proxy for human preferences, providing signals that guide LLM behavior across a variety of tasks. In this paper, we provide a comprehensive overview of the relevant research, examining RMs from the perspectives of preference collection, reward modeling, and usage. We then introduce the applications of RMs and discuss benchmarks for their evaluation. Furthermore, we analyze in depth the challenges facing the field and explore potential research directions. This paper aims to give beginners a comprehensive introduction to RMs and to facilitate future studies. The resources are publicly available on GitHub at https://github.com/JLZhong23/awesome-reward-models.
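To make concrete what "serving as a proxy for human preferences" typically means in practice, the sketch below shows the pairwise Bradley-Terry objective commonly used to train RMs on (chosen, rejected) response pairs. This is a minimal illustration under assumed names and toy values, not code from the survey or its repository; the function `pairwise_rm_loss` and the sample rewards are hypothetical.

```python
import torch
import torch.nn.functional as F

def pairwise_rm_loss(reward_chosen: torch.Tensor,
                     reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry negative log-likelihood: train the RM so the
    preferred ("chosen") response scores higher than the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage: scalar rewards an RM head might emit for a batch of
# (chosen, rejected) response pairs. Values are illustrative only.
r_chosen = torch.tensor([1.2, 0.4, 2.0])
r_rejected = torch.tensor([0.3, 0.9, 1.1])
loss = pairwise_rm_loss(r_chosen, r_rejected)
print(loss.item())  # lower loss when chosen responses outscore rejected ones
```

Minimizing this loss widens the reward margin on preferred responses, which is what lets the trained RM later supply scalar preference signals for downstream alignment methods such as RLHF.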