The Unlearnability Phenomenon in RLVR for Language Models

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study reveals a class of inherently unlearnable hard examples in reinforcement learning with verifiable rewards (RLVR): even when correct reasoning trajectories exist, language models struggle to master them. Through cross-sample gradient analysis, representation similarity assessment, and data augmentation experiments, the authors identify the root cause as an intrinsic representational deficiency—these samples exhibit low gradient similarity with others and lack generalizable reasoning patterns. The work provides the first systematic characterization of unlearnable data in RLVR, demonstrating that current optimization and sampling strategies fail to mitigate this issue. These findings expose a fundamental limitation of existing reinforcement learning approaches on complex reasoning tasks and offer crucial theoretical insights for future algorithm design.

📝 Abstract

Reinforcement Learning with Verifiable Reward (RLVR) has proven effective in improving the reasoning ability of Large Language Models (LLMs). However, the learning dynamics of RLVR remain underexplored. In this paper, we reveal a counterintuitive phenomenon: among hard examples that the model initially struggles with, a substantial subset remains unlearnable even when correct rollouts are present. To understand the phenomenon, we first demonstrate that existing optimization and sampling techniques fail to resolve unlearnability. With cross-example gradient analysis, we show that unlearnable examples have a fundamental representation issue, characterized by low gradient similarity with the rest of the examples and ungeneralizable reasoning patterns. We further show that representation flaws are difficult to mitigate in RL, as data augmentation does not improve gradient similarity. Our study provides the first systematic characterization of unlearnable data in RLVR training and reveals fundamental limitations in current RL approaches for reasoning tasks. Code and data are available at https://github.com/yulinchen99/unlearnability-rlvr.
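To make "low gradient similarity with the rest of the examples" concrete, here is a minimal sketch of one way such a score could be computed. It is an illustration under stated assumptions, not the authors' implementation: a tiny logistic-regression model stands in for the language model, the per-example loss gradient stands in for the RLVR policy gradient, and the function names (`per_example_grad`, `cosine`, `mean_similarity_to_rest`) are hypothetical.

```python
import math

def per_example_grad(w, x, y):
    """Gradient of the logistic loss for one example (x, y) w.r.t. weights w.

    Assumption: a toy stand-in for a per-example policy gradient.
    """
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * xi for xi in x]

def cosine(u, v):
    """Cosine similarity between two gradient vectors (0.0 if either is zero)."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def mean_similarity_to_rest(grads):
    """For each example, average its cosine similarity with all other examples."""
    n = len(grads)
    scores = []
    for i in range(n):
        others = [cosine(grads[i], grads[j]) for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    return scores

# Hypothetical data: three examples share a direction, the last one does not.
w = [0.0, 0.0]
examples = [([1.0, 0.0], 1), ([2.0, 0.1], 1), ([1.5, -0.1], 1), ([0.0, 1.0], 1)]
grads = [per_example_grad(w, x, y) for x, y in examples]
scores = mean_similarity_to_rest(grads)

# The example whose gradient is least aligned with the others.
outlier = scores.index(min(scores))
```

In this toy setting, an update that helps the first three examples barely moves the loss on the last one, because its gradient is nearly orthogonal to theirs. That is the intuition behind reading a low cross-example similarity score as a sign that an example shares little learnable structure with the rest of the data.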

Problem

Research questions and friction points this paper is trying to address.

Unlearnability

Reinforcement Learning

Language Models

Reasoning

Gradient Similarity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unlearnability

RLVR

Gradient Similarity