Persistent Story World Simulation with Continuous Character Customization

📅 2026-03-17
🤖 AI Summary
Existing story visualization methods struggle to achieve precise character customization, semantic consistency, and continual integration of new characters at the same time. To address this, the work proposes EverTale, a story world simulator that uses a single unified LoRA module for continual character adaptation. EverTale introduces three core mechanisms: an All-in-One-World Character Integrator, a Character Quality Gate in which a multimodal large language model (MLLM) judges each adaptation through chain-of-thought reasoning, and a Character-Aware Region-Focus Sampling strategy. The authors report that the approach outperforms the compared methods on both single- and multi-character story generation, mitigating identity degradation and layout conflicts while keeping story visualization coherent and scalable.

📝 Abstract
Story visualization has gained increasing attention in computer vision. However, current methods often fail to combine accurate character customization, semantic alignment, and continuous integration of new identities. To tackle this challenge, we present EverTale, a story world simulator for continuous story character customization. We first propose an All-in-One-World Character Integrator that achieves continuous character adaptation within a unified LoRA module, eliminating the per-character optimization modules required by previous methods. We then incorporate a Character Quality Gate via MLLM-as-Judge, which uses chain-of-thought reasoning to check the fidelity of each character adaptation and to decide whether the model can proceed to the next character or requires additional training on the current one. We also introduce a Character-Aware Region-Focus Sampling strategy to address the identity degradation and layout conflicts found in existing multi-character visual storytelling; it harmonizes local character-specific details with the global scene context to produce natural multi-character generation with higher efficiency. Experimental results show that EverTale achieves superior performance over a wide range of compared methods on both single- and multi-character story visualization. Code will be made available.
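The gated adaptation loop described in the abstract can be sketched roughly as follows. This is only an illustrative outline of the control flow, not the authors' implementation: the function name adapt_characters, the two callables train_round and judge_passes, and the max_rounds cap are all hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List


def adapt_characters(
    characters: List[str],
    train_round: Callable[[str], None],
    judge_passes: Callable[[str], bool],
    max_rounds: int = 5,
) -> Dict[str, int]:
    """Adapt characters one at a time into a single shared adapter.

    train_round(name): hypothetical stand-in for one round of training
        the shared (unified) adapter on the given character.
    judge_passes(name): hypothetical stand-in for the MLLM-based quality
        gate; returns True when the character is judged faithful enough.
    max_rounds: assumed safety cap so the loop always terminates.

    Returns the number of training rounds spent on each character.
    """
    rounds_used: Dict[str, int] = {}
    for name in characters:
        rounds = 0
        while rounds < max_rounds:
            train_round(name)       # update the shared adapter
            rounds += 1
            if judge_passes(name):  # gate: move on to the next character
                break
            # otherwise: train further on the current character
        rounds_used[name] = rounds
    return rounds_used
```

In this sketch the gate is consulted after every training round, so a character that passes early costs only one round, while a harder character receives additional training until it passes or reaches the assumed cap.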
Problem

Research questions and friction points this paper is trying to address.

story visualization
character customization
identity integration
multi-character generation
semantic alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

continuous character customization
LoRA-based integration
MLLM-as-Judge
multi-character visual storytelling
region-focus sampling
Jinlu Zhang
Xiamen University
Qiyun Wang
Xiamen University
Baoxiang Du
Xiamen University
Jiayi Ji
Rutgers University
Jing He
The Hong Kong University of Science and Technology, Guangzhou
Generative Models, Image Generation
Rongsheng Zhang
Fuxi AI Lab, NetEase Inc., Hangzhou, China
NLP
Tangjie Lv
NetEase
reinforcement learning
Xiaoshuai Sun
Xiamen University
Rongrong Ji
Xiamen University