IdentityStory: Taming Your Identity-Preserving Generator for Human-Centric Story Generation

📅 2025-12-29

📈 Citations: 0

✨ Influential: 0

career value

200K/year

🤖 AI Summary

This work addresses the multi-frame character identity consistency challenge in human-centered image story generation—particularly concerning high-fidelity facial preservation and cross-frame identity coherence. We propose an identity-aware fine-tuning framework built upon diffusion models, centered on two key innovations: iterative identity discovery and re-denoising-based identity injection. Our approach leverages CLIP-guided cross-frame identity alignment and iterative latent-space clustering to achieve precise, semantics-preserving identity control. To our knowledge, this is the first method to systematically resolve long-sequence, multi-character identity consistency. Evaluated on the ConsiStory-Human benchmark, it achieves a 23.6% improvement in ID-Retrieval accuracy, supports arbitrarily long story generation, enables real-time character composition, and attains a 91.2% success rate in multi-character scenes.

Technology Category

Application Category

📝 Abstract

Recent visual generative models enable story generation with consistent characters from text, but human-centric story generation faces additional challenges, such as maintaining detailed and diverse human face consistency and coordinating multiple characters across different images. This paper presents IdentityStory, a framework for human-centric story generation that ensures consistent character identity across multiple sequential images. By taming identity-preserving generators, the framework features two key components: Iterative Identity Discovery, which extracts cohesive character identities, and Re-denoising Identity Injection, which re-denoises images to inject identities while preserving desired context. Experiments on the ConsiStory-Human benchmark demonstrate that IdentityStory outperforms existing methods, particularly in face consistency, and supports multi-character combinations. The framework also shows strong potential for applications such as infinite-length story generation and dynamic character composition.

Problem

Research questions and friction points this paper is trying to address.

Maintains consistent human face identity across sequential images

Coordinates multiple characters in human-centric story generation

Enhances face consistency and supports multi-character combinations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative Identity Discovery extracts cohesive character identities

Re-denoising Identity Injection injects identities while preserving context

Framework ensures consistent character identity across sequential images

🔎 Similar Papers

Tackling copyright issues in AI image generation through originality estimation and genericization