Character-R1: Enhancing Role-Aware Reasoning in Role-Playing Agents via RLVR

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing role-playing agents often exhibit behavior inconsistent with their assigned roles in complex scenarios due to a lack of internal cognitive coherence. This work proposes Character-R1, a Reinforcement Learning with Verifiable Rewards (RLVR) framework that introduces three mechanisms (Cognitive Focus Reward, Reference-Guided Reward, and Character-Conditioned Reward Normalization) to enable structured modeling and optimization of multidimensional cognitive elements such as knowledge and memory. By explicitly aligning agent behavior with role-specific cognitive attributes, Character-R1 significantly improves both role consistency and expressive performance, outperforming current state-of-the-art approaches.

📝 Abstract
Current role-playing agents (RPAs) are typically constructed by imitating surface-level behaviors, but this approach lacks internal cognitive consistency, often causing out-of-character errors in complex situations. To address this, we propose Character-R1, a framework designed to provide comprehensive verifiable reward signals for effective role-aware reasoning, which are missing in recent studies. Specifically, our framework comprises three core designs: (1) Cognitive Focus Reward, which enforces explicit label-based analysis of 10 character elements (e.g., worldview) to structure internal cognition; (2) Reference-Guided Reward, which utilizes overlap-based metrics with reference responses as optimization anchors to enhance exploration and performance; and (3) Character-Conditioned Reward Normalization, which adjusts reward distributions based on character categories to ensure robust optimization across heterogeneous roles. Extensive experiments demonstrate that Character-R1 significantly outperforms existing methods on knowledge, memory, and other dimensions.
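The first two reward designs from the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `[element]` label convention, the three-element subset (the paper uses 10), and token-level F1 as the overlap metric are all assumptions, since the paper's exact formulas are not given here.

```python
def cognitive_focus_reward(response: str,
                           elements=("worldview", "memory", "knowledge")) -> float:
    """Fraction of required character-element labels the response explicitly
    analyzes. Assumes elements are marked with bracketed labels like
    '[worldview]'; only three of the paper's 10 elements are shown."""
    text = response.lower()
    hits = sum(1 for e in elements if f"[{e}]" in text)
    return hits / len(elements)


def reference_guided_reward(response: str, reference: str) -> float:
    """Overlap-based score against a reference response, using token-set F1
    as a stand-in for the paper's unspecified overlap metric."""
    resp = set(response.lower().split())
    ref = set(reference.lower().split())
    if not resp or not ref:
        return 0.0
    overlap = len(resp & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(resp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

In an RLVR loop, scores like these would be combined into a scalar reward per rollout and fed to a policy-gradient optimizer; the weighting between components is another detail the abstract leaves unspecified.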
Problem

Research questions and friction points this paper is trying to address.

role-playing agents
cognitive consistency
out-of-character errors
role-aware reasoning
character modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Role-Aware Reasoning
Reinforcement Learning with Verifiable Rewards (RLVR)
Cognitive Focus Reward
Reference-Guided Reward
Character-Conditioned Reward Normalization
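The third design, Character-Conditioned Reward Normalization, can be sketched as a per-category z-score so that easy-to-score character types do not dominate optimization. The grouped z-score is an assumption; the paper's exact normalization scheme is not described here.

```python
from collections import defaultdict
from statistics import mean, pstdev


def character_conditioned_normalize(rewards, categories):
    """Normalize each reward by the mean/std of its character category
    (e.g. 'historical' vs 'fictional'), so heterogeneous roles contribute
    comparable reward scales. Uses a population z-score per category,
    falling back to std=1 for degenerate groups."""
    by_cat = defaultdict(list)
    for r, c in zip(rewards, categories):
        by_cat[c].append(r)
    stats = {c: (mean(v), pstdev(v) or 1.0) for c, v in by_cat.items()}
    return [(r - stats[c][0]) / stats[c][1]
            for r, c in zip(rewards, categories)]
```

With this sketch, rewards `[1, 3]` for one category and `[2, 4]` for another both normalize to `[-1, 1]`, removing the between-category offset.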