Interactive AI NPCs Powered by LLMs: Technical Report for the CPDC Challenge 2025

📅 2025-11-25
🤖 AI Summary
This work addresses the weak commonsense reasoning and poor role consistency exhibited by large language model (LLM)-driven non-player characters (NPCs) in interactive dialogue. To this end, we propose an interactive role-modeling framework that combines context engineering with reinforcement learning. Methodologically, we introduce dynamic tool pruning and persona-aware feature trimming for efficient context compression, and we replace supervised fine-tuning with reward-guided Group Relative Policy Optimization (GRPO) to strengthen task-oriented behavior while mitigating few-shot overfitting. The approach also incorporates systematic engineering improvements, including prompt optimization, streamlined tool invocation, parameter normalization, and function merging. On the CPDC 2025 Challenge, our solution ranked first in Task 2 (API), second in Task 1 (API), and third in both Task 3 (API) and the GPU track, demonstrating substantial gains in commonsense reasoning, role consistency, and cross-task generalization.

📝 Abstract
This report presents the solution and results of our team MSRA_SC in the Commonsense Persona-Grounded Dialogue Challenge (CPDC 2025). We propose a simple yet effective framework that unifies improvements across both GPU Track and API Track. Our method centers on two key components. First, Context Engineering applies dynamic tool pruning and persona clipping for input compression, combined with post-processing techniques such as parameter normalization and function merging. Together with manually refined prompts, this design improves tool call stability, execution reliability, and role-playing guidance. Second, in the GPU Track, we further adopt GRPO training, replacing supervised fine-tuning with reinforcement learning directly optimized by reward signals. This mitigates small-sample overfitting and significantly enhances task-oriented dialogue performance. In the final evaluation, our team ranks 1st in Task 2 API, 2nd in Task 1 API, and 3rd in both Task 3 API and GPU track, demonstrating the effectiveness of our approach. Our code is publicly available at https://gitlab.aicrowd.com/nikoo_yu/cpdc-2025-winning-solution
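The Context Engineering component described above, dynamic tool pruning plus persona clipping, can be sketched as simple relevance filtering over the prompt inputs. The sketch below is a minimal illustration under assumed data shapes (tool dicts with `name`/`description`, persona as a list of sentences); the word-overlap scoring and all names are hypothetical, not the authors' actual implementation.

```python
# Illustrative sketch of input compression for an NPC dialogue prompt:
# keep only the tools and persona sentences most relevant to the current turn.

def prune_tools(tools, user_turn, max_tools=4):
    """Dynamic tool pruning: rank tools by word overlap with the user turn."""
    turn_words = set(user_turn.lower().split())

    def overlap(tool):
        desc_words = set((tool["name"] + " " + tool["description"]).lower().split())
        return len(turn_words & desc_words)

    return sorted(tools, key=overlap, reverse=True)[:max_tools]

def clip_persona(persona_sentences, user_turn, max_sentences=3):
    """Persona clipping: keep the persona sentences most related to the turn."""
    turn_words = set(user_turn.lower().split())
    ranked = sorted(
        persona_sentences,
        key=lambda s: len(turn_words & set(s.lower().split())),
        reverse=True,
    )
    return ranked[:max_sentences]

# Toy tool list for a shopkeeper NPC (illustrative).
tools = [
    {"name": "buy_item", "description": "buy an item from the shop"},
    {"name": "get_weather", "description": "report the current weather"},
    {"name": "give_quest", "description": "give the player a quest"},
]
pruned = prune_tools(tools, "I want to buy a sword from the shop", max_tools=1)
```

In practice a learned or embedding-based relevance score would replace the word-overlap heuristic, but the compression structure (rank, then truncate to a budget) is the same.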
Problem

Research questions and friction points this paper is trying to address.

How to improve tool-call stability and execution reliability through context engineering
How to enhance task-oriented dialogue performance with GRPO reinforcement learning
How to mitigate small-sample overfitting in persona-grounded dialogue systems
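The GRPO training mentioned above hinges on a group-relative advantage: several responses are sampled per prompt, scored by the task reward, and each response's reward is normalized against its own group. The sketch below shows only that normalization step, with illustrative reward values; it is not the paper's training code.

```python
# Minimal sketch of the group-relative advantage used in GRPO
# (Group Relative Policy Optimization): A_i = (r_i - mean(r)) / (std(r) + eps).

from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Rewards for four sampled NPC responses to the same prompt (toy values).
advantages = group_relative_advantages([1.0, 0.5, 0.0, 0.5])
# Responses scored above the group mean get a positive advantage and are
# reinforced; below-mean responses are pushed down.
```

Because the baseline is the group mean rather than a learned value function, this objective needs no critic, which is one reason it suits small task-specific reward signals like those in this challenge.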
Innovation

Methods, ideas, or system contributions that make the work stand out.

Context Engineering with dynamic pruning and persona clipping
GRPO training replaces supervised fine-tuning with reinforcement learning
Parameter normalization and function merging enhance execution reliability
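The two post-processing steps above can be sketched as follows. This is a hedged illustration under assumed call and schema formats (calls as `{"name", "params"}` dicts, a flat type schema); the field names and merge policy are hypothetical, not the authors' exact pipeline.

```python
# Illustrative post-processing for model-emitted tool calls:
# (1) parameter normalization coerces arguments to their declared types;
# (2) function merging collapses repeated calls to the same function.

def normalize_params(params, schema):
    """Coerce each argument to the type named in the (assumed) schema."""
    casts = {"integer": int, "number": float, "string": str, "boolean": bool}
    out = {}
    for key, value in params.items():
        expected = schema.get(key, "string")
        try:
            out[key] = casts[expected](value)
        except (ValueError, TypeError):
            out[key] = value  # leave unparseable values unchanged
    return out

def merge_calls(calls):
    """Merge repeated calls to one function into a single call per name."""
    merged = {}
    for call in calls:
        merged.setdefault(call["name"], {}).update(call["params"])
    return [{"name": n, "params": p} for n, p in merged.items()]

# Toy example: the model split one purchase into two partial calls.
calls = [
    {"name": "buy_item", "params": {"item": "sword"}},
    {"name": "buy_item", "params": {"count": "2"}},
]
merged = merge_calls(calls)
schema = {"item": "string", "count": "integer"}
merged[0]["params"] = normalize_params(merged[0]["params"], schema)
```

Fixes of this kind trade a little post-hoc logic for a large drop in execution failures, since most malformed calls differ from a valid one only in argument type or call fragmentation.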