HumanLLM: Benchmarking and Improving LLM Anthropomorphism via Human Cognitive Patterns

📅 2026-01-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the difficulty current large language models have in authentically simulating human cognition and behavior in role-playing scenarios. We propose HumanLLM, a framework that models psychological patterns as interacting causal forces. We construct 244 psychologically grounded patterns from roughly 12,000 academic papers and synthesize 11,359 multi-pattern scenarios in which 2-5 patterns reinforce, conflict with, or modulate each other, each expressed through multi-turn conversations containing inner thoughts, actions, and dialogue. Dual-level checklists separately assess individual pattern fidelity and emergent multi-pattern dynamics; they align strongly with human judgment (r=0.91) and reveal that holistic metrics conflate simulation accuracy with social desirability. HumanLLM-8B outperforms Qwen3-32B, a model with four times as many parameters, on multi-pattern dynamics, indicating that authentic anthropomorphism requires modeling the psychological processes behind behavior, not only the behavior itself.

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and generation, serving as the foundation for advanced persona simulation and Role-Playing Language Agents (RPLAs). However, achieving authentic alignment with human cognitive and behavioral patterns remains a critical challenge for these agents. We present HumanLLM, a framework treating psychological patterns as interacting causal forces. We construct 244 patterns from ~12,000 academic papers and synthesize 11,359 scenarios where 2-5 patterns reinforce, conflict, or modulate each other, with multi-turn conversations expressing inner thoughts, actions, and dialogue. Our dual-level checklists evaluate both individual pattern fidelity and emergent multi-pattern dynamics, achieving strong human alignment (r=0.91) while revealing that holistic metrics conflate simulation accuracy with social desirability. HumanLLM-8B outperforms Qwen3-32B on multi-pattern dynamics despite 4x fewer parameters, demonstrating that authentic anthropomorphism requires cognitive modeling--simulating not just what humans do, but the psychological processes generating those behaviors.
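To make the reported human alignment figure concrete, here is a minimal sketch of how checklist scores could be compared with human ratings. Everything in it is an assumption for illustration: the abstract does not say which correlation statistic r denotes (Pearson is assumed here), and the function names, the checklist format, and the toy data are hypothetical, not the authors' evaluation code.

```python
from math import sqrt


def checklist_score(answers):
    """Fraction of checklist items judged satisfied (hypothetical scoring rule)."""
    return sum(answers) / len(answers)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data (invented): each response has yes/no checklist judgments at two levels,
# individual pattern fidelity and multi-pattern dynamics, plus a human rating.
responses = [
    {"pattern": [True, True, True], "dynamics": [True, True], "human": 4.5},
    {"pattern": [True, False, True], "dynamics": [True, False], "human": 3.0},
    {"pattern": [False, False, True], "dynamics": [False, False], "human": 1.5},
    {"pattern": [True, True, False], "dynamics": [False, True], "human": 3.5},
]

# Average the two levels into one automatic score per response (an assumed aggregation).
auto_scores = [
    (checklist_score(r["pattern"]) + checklist_score(r["dynamics"])) / 2
    for r in responses
]
human_scores = [r["human"] for r in responses]

print(f"r = {pearson_r(auto_scores, human_scores):.2f}")
```

A value of r close to 1 would mean the checklist scores rank responses much as human raters do. Keeping the two levels as separate scores, rather than averaging them as above, is what would let an evaluation distinguish single-pattern fidelity from multi-pattern dynamics.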
Problem

Research questions and friction points this paper is trying to address.

anthropomorphism
cognitive patterns
human alignment
role-playing agents
psychological modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

anthropomorphism
cognitive modeling
psychological patterns
multi-pattern dynamics
role-playing language agents
Xintao Wang
Fudan University
Role-Playing Agents, Language Agents, Large Language Models, Knowledge Graphs
Jian Yang
Fudan University
Weiyuan Li
Alibaba Group
RL, LLM, Agent
Rui Xie
Fudan University
Jen-tse Huang
Johns Hopkins University
Jun Gao
Hello Group
Shuai Huang
Hello Group
Liyuan Gou
Fudan University
Hongwei Feng
Fudan University
Knowledge Management, AI, Big Data
Yanghua Xiao
Fudan University