CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards

📅 2025-07-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current role-playing language agents (RPLAs) predominantly rely on prompt engineering or supervised fine-tuning, neglecting the underlying cognitive mechanisms that govern agent behavior. To address this, we propose CogDual, a framework that instantiates a cognize-then-respond paradigm: drawing on dual-process ideas from cognitive psychology, the model first forms external situational awareness and internal self-cognition, and only then generates its response. Methodologically, we optimize CogDual with reinforcement learning using implicit rule-based rewards, enabling cognition-consistent optimization for open-domain text generation without additional human annotation. Evaluations on the CoSER, Cross-MR, and LifeChoice benchmarks show that CogDual substantially improves character behavioral consistency and contextual alignment while generalizing better than existing approaches.

📝 Abstract
Role-Playing Language Agents (RPLAs) have emerged as a significant application direction for Large Language Models (LLMs). Existing approaches typically rely on prompt engineering or supervised fine-tuning to enable models to imitate character behaviors in specific scenarios, but often neglect the underlying cognitive mechanisms driving these behaviors. Inspired by cognitive psychology, we introduce CogDual, a novel RPLA adopting a cognize-then-respond reasoning paradigm. By jointly modeling external situational awareness and internal self-awareness, CogDual generates responses with improved character consistency and contextual alignment. To further optimize performance, we employ reinforcement learning with two general-purpose reward schemes designed for open-domain text generation. Extensive experiments on the CoSER benchmark, as well as Cross-MR and LifeChoice, demonstrate that CogDual consistently outperforms existing baselines and generalizes effectively across diverse role-playing tasks.
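The abstract does not specify the exact prompts or reward rules, so the following is a minimal, illustrative sketch of what a cognize-then-respond rollout with an implicit rule-based reward could look like. The tag names (<cognition>, <response>), prompt wording, keyword checks, and reward weights are all assumptions for illustration, not the paper's implementation; a real setup would plug in an actual LLM client and optimize the policy against such rewards (e.g., with PPO- or GRPO-style training).

```python
# Illustrative sketch only: prompt wording, tags, and reward rules are
# assumptions, not CogDual's actual implementation.
import re
from typing import Callable

def cognize_then_respond(llm: Callable[[str], str], persona: str, scene: str, query: str) -> str:
    """Two-pass generation: first an explicit cognition step (external
    situational awareness + internal self-cognition), then a response
    conditioned on that cognition."""
    cognition_prompt = (
        f"You are role-playing as {persona}.\n"
        f"Scene: {scene}\n"
        f"Before replying, write your cognition inside <cognition>...</cognition>:\n"
        f"1) what is happening externally, 2) what {persona} feels and intends.\n"
        f"User: {query}"
    )
    cognition = llm(cognition_prompt)
    response_prompt = (
        f"{cognition_prompt}\n{cognition}\n"
        f"Now reply in character inside <response>...</response>."
    )
    return cognition + "\n" + llm(response_prompt)

def rule_based_reward(output: str, persona_keywords: list[str]) -> float:
    """Toy implicit rule-based reward: no human labels, only automatic
    format and persona-relevance checks on the generated text."""
    reward = 0.0
    if re.search(r"<cognition>.+?</cognition>", output, re.S):
        reward += 0.5                      # cognition step is present
    m = re.search(r"<response>(.+?)</response>", output, re.S)
    if m:
        reward += 0.3                      # well-formed response block
        text = m.group(1).lower()
        if any(k.lower() in text for k in persona_keywords):
            reward += 0.2                  # touches persona-relevant content
    return reward

if __name__ == "__main__":
    # Dummy model so the sketch runs without an API; swap in a real client.
    dummy = lambda p: ("<cognition>The rival is bluffing; I stay calm.</cognition>"
                       if "Now reply" not in p
                       else "<response>Calm down, detective work takes patience.</response>")
    out = cognize_then_respond(dummy, "Sherlock Holmes", "a tense interrogation", "Did he do it?")
    print(out)
    print("reward:", rule_based_reward(out, ["detective", "deduce", "patience"]))
```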
Problem

Research questions and friction points this paper is trying to address.

Enhancing dual cognition in LLMs via reinforcement learning
Improving character consistency and contextual alignment in RPLAs
Generalizing performance across diverse role-playing tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning with implicit rule-based rewards
Joint modeling of external and internal awareness
Cognize-then-respond reasoning paradigm
👥 Authors
Cheng Liu (Hunyuan AI Digital Human, Tencent, Shenzhen, China)
Yifei Lu (Northeastern University, Shenyang, China)
Fanghua Ye (University College London)
Jian Li (Hunyuan AI Digital Human, Tencent, Shenzhen, China)
Xingyu Chen (Hunyuan AI Digital Human, Tencent, Shenzhen, China)
Feiliang Ren (Northeastern University)
Zhaopeng Tu (Tech Lead, Tencent Digital Human)
Xiaolong Li (Hunyuan AI Digital Human, Tencent, Shenzhen, China)