🤖 AI Summary
This study investigates the privacy and security risks that arise when large language models (LLMs) emulate an individual's writing style and personal knowledge. The authors propose IMPersona, a framework for evaluating how convincingly LLMs can impersonate specific individuals. Combining supervised fine-tuning with a hierarchical memory-inspired retrieval system, a fine-tuned Llama-3.1-8B-Instruct was judged human in 44.44% of blind conversational interactions, substantially outperforming the best prompting-based baseline (25.00%). These results show that even modestly sized open-source models can reach concerning levels of impersonation ability after lightweight fine-tuning. Based on this analysis, the authors propose detection methods and defense strategies, offering an empirical foundation for the ethical governance of personalized AI systems.
📝 Abstract
As language models achieve increasingly human-like capabilities in conversational text generation, a critical question emerges: to what extent can these systems simulate the characteristics of specific individuals? To evaluate this, we introduce IMPersona, a framework for evaluating language models at impersonating specific individuals' writing styles and personal knowledge. Using supervised fine-tuning and a hierarchical memory-inspired retrieval system, we demonstrate that even modestly sized open-source models, such as Llama-3.1-8B-Instruct, can achieve concerning levels of impersonation ability. In blind conversation experiments, participants (mis)identified our fine-tuned models with memory integration as human in 44.44% of interactions, compared to just 25.00% for the best prompting-based approach. We analyze these results to propose detection methods and defense strategies against such impersonation attempts. Our findings raise important questions about both the potential applications and risks of personalized language models, particularly regarding privacy, security, and the ethical deployment of such technologies in real-world contexts.