Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

📅 2025-02-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing video personalization methods mostly handle a single concept; when extended to several, they tend to blend identities. This paper presents Movie Weaver, a tuning-free framework for multi-concept video personalization that generates one personalized video from multiple reference images (e.g., faces, bodies, or animals). The approach rests on two mechanisms built into a text-to-video model: (1) anchored prompts, which embed an image anchor for each reference image as a unique token in the text prompt, so each concept is tied to its own reference; and (2) concept embeddings, which encode the order of the reference images. In the reported evaluation, Movie Weaver outperforms existing methods in identity preservation and overall quality for multi-concept personalization. A single model supports flexible combinations of reference concepts.

📝 Abstract

Video personalization, which generates customized videos using reference images, has gained significant attention. However, prior methods typically focus on single-concept personalization, limiting broader applications that require multi-concept integration. Attempts to extend these models to multiple concepts often lead to identity blending, which results in composite characters with fused attributes from multiple sources. This challenge arises due to the lack of a mechanism to link each concept with its specific reference image. We address this with anchored prompts, which embed image anchors as unique tokens within text prompts, guiding accurate referencing during generation. Additionally, we introduce concept embeddings to encode the order of reference images. Our approach, Movie Weaver, seamlessly weaves multiple concepts, including face, body, and animal images, into one video, allowing flexible combinations in a single model. The evaluation shows that Movie Weaver outperforms existing methods for multi-concept video personalization in identity preservation and overall quality.
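
The anchored-prompt idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the `<ref_i>` token format, the `build_anchored_prompt` helper, and the file names are hypothetical, and the abstract does not say how or where anchors are placed in the prompt. The sketch only shows the principle of tying each concept mention to one reference image through a unique token.

```python
import re


def build_anchored_prompt(prompt, concepts):
    """Insert a unique anchor token after the first mention of each concept.

    `concepts` is a list of (phrase, image_id) pairs in reference order.
    The "<ref_i>" token format is an assumption made for this sketch.
    Returns the anchored prompt and a mapping from anchor token to image.
    """
    anchored = prompt
    anchor_to_image = {}
    for index, (phrase, image_id) in enumerate(concepts):
        anchor = f"<ref_{index}>"
        # Whole-word match so that "man" does not match inside "woman".
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        match = pattern.search(anchored)
        if match is None:
            raise ValueError(f"concept phrase not found in prompt: {phrase!r}")
        end = match.end()
        anchored = anchored[:end] + " " + anchor + anchored[end:]
        anchor_to_image[anchor] = image_id
    return anchored, anchor_to_image


# Hypothetical prompt and reference images.
prompt = "A man and a woman walk a dog on the beach."
concepts = [("man", "face_a.png"), ("woman", "face_b.png"), ("dog", "dog.png")]
anchored_prompt, anchor_to_image = build_anchored_prompt(prompt, concepts)
print(anchored_prompt)
# A man <ref_0> and a woman <ref_1> walk a dog <ref_2> on the beach.
```

Because every anchor is distinct, a downstream model has an unambiguous key for looking up which reference image belongs to which described subject, which is the link the abstract identifies as missing in prior methods.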

Problem

Research questions and friction points this paper is trying to address.

Multi-concept video personalization challenges

Identity blending in multi-concept integration

No mechanism linking each concept to its reference image

Innovation

Methods, ideas, or system contributions that make the work stand out.

Anchored prompts for accurate referencing

Concept embeddings encode reference order

Seamless multi-concept video personalization
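
To make "concept embeddings encode reference order" concrete, here is a minimal sketch under stated assumptions. The page does not describe the form of the embedding, so this assumes one learned vector per reference position that is added to every image token of that reference; the function name, the dimensions, and the random stand-in for learned weights are all hypothetical.

```python
import random


def add_concept_embeddings(image_tokens_per_concept, concept_table):
    """Add an order-specific vector to every token of each reference image.

    image_tokens_per_concept[i] is a list of token vectors for the i-th
    reference image; concept_table[i] is the vector for position i.
    Additive combination is an assumption of this sketch.
    """
    if len(image_tokens_per_concept) > len(concept_table):
        raise ValueError("more reference images than concept embeddings")
    tagged = []
    for order, tokens in enumerate(image_tokens_per_concept):
        offset = concept_table[order]
        tagged.append([[t + o for t, o in zip(token, offset)] for token in tokens])
    return tagged


random.seed(0)
DIM = 4           # toy feature size
MAX_CONCEPTS = 3  # toy upper bound on reference images

# Random stand-in for a learned table: one vector per reference position.
concept_table = [
    [random.gauss(0.0, 0.02) for _ in range(DIM)] for _ in range(MAX_CONCEPTS)
]

# Two reference images with identical all-zero features, two tokens each.
refs = [[[0.0] * DIM for _ in range(2)] for _ in range(2)]
tagged = add_concept_embeddings(refs, concept_table)
```

Even though both toy references start with identical features, their tokens differ after tagging, so the model can tell the first reference from the second.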
