🤖 AI Summary
This paper addresses the “interaction hallucination” problem in large language models (LLMs) during role-playing social interactions—i.e., the generation of factually inconsistent or role-incongruent content caused by stance drift. It proposes the first explicit, generalizable evaluation paradigm for this issue: the authors formally define interaction hallucination and introduce SHARP, a benchmark jointly driven by commonsense knowledge graphs and role-stance modeling. SHARP enables the first quantitative trade-off analysis between role fidelity and factual consistency, and it supports stable evaluation across 12 distinct worldviews, significantly improving hallucination detection accuracy and cross-role generalization. Through systematic analysis, the paper identifies seven critical influencing factors. This work establishes both a theoretical foundation and an evaluation infrastructure for developing trustworthy role-playing agents.
📝 Abstract
The advanced role-playing capabilities of Large Language Models (LLMs) have paved the way for developing Role-Playing Agents (RPAs). However, existing social-interaction benchmarks such as HPD and SocialBench have not investigated hallucination and suffer from limitations such as poor generalizability and implicit judgments of character fidelity. To address these issues, we propose a generalizable, explicit, and effective paradigm to unlock the interactive patterns in diverse worldviews. Specifically, we define interactive hallucination based on stance transfer and construct a benchmark, SHARP, by extracting relations from a general commonsense knowledge graph and leveraging the inherent hallucination properties of RPAs to simulate interactions across roles. Extensive experiments validate the effectiveness and stability of our paradigm. Our findings further explore the factors influencing these metrics and discuss the trade-off between blind loyalty to roles and adherence to facts in RPAs.