Few-shot multi-token DreamBooth with LoRa for style-consistent character generation

📅 2025-10-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the challenge of generating stylistically consistent yet highly diverse novel characters from only a few reference examples, this paper proposes an efficient diffusion-based fine-tuning method. The approach decouples individual identity from shared artistic style via a multi-token clustering assignment mechanism; replaces class-specific regularization with random token augmentation to enhance diversity; and integrates LoRA for parameter-efficient adaptation within the DreamBooth text-to-image framework. Experiments on five specialized small-scale datasets demonstrate significant improvements over state-of-the-art baselines: superior quantitative performance (lower FID, lower LPIPS) and human evaluations confirming advantages in style fidelity, detail richness, and character novelty. The method enables unlimited, controllable style-aware character generation with minimal supervision.
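
As a rough illustration of the LoRA component mentioned above, the sketch below shows the standard low-rank update, W' = W + (alpha / r) · B · A, in plain Python. The matrix sizes, the `alpha` value, and the helper names (`matmul`, `lora_effective_weight`, `trainable_params`) are assumptions chosen for illustration; they are not taken from the paper, which does not state its rank or scaling here.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r (the rank) is the number of rows of A."""
    r = len(a)
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in the two low-rank factors A and B."""
    return r * (d_in + d_out)


# Hypothetical toy sizes, for illustration only.
rng = random.Random(0)
d_out, d_in, r = 4, 6, 2
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # small random init
b = [[0.0] * r for _ in range(d_out)]                                # zero init

# With B initialised to zero, the adapted layer starts identical to the frozen one.
assert lora_effective_weight(w, a, b, alpha=4) == w
```

For a 768 × 768 projection at rank 4 (again a hypothetical size), `trainable_params(768, 768, 4)` gives 6,144 trainable values against 589,824 in the full matrix, which is why this kind of adaptation suits few-shot fine-tuning.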

📝 Abstract
The audiovisual industry is undergoing a profound transformation as it integrates AI developments not only to automate routine tasks but also to inspire new forms of art. This paper addresses the problem of producing a virtually unlimited number of novel characters that preserve the artistic style and shared visual traits of a small set of human-designed reference characters, thus broadening creative possibilities in animation, gaming, and related domains. Our solution builds upon DreamBooth, a well-established fine-tuning technique for text-to-image diffusion models, and adapts it to tackle two core challenges: capturing intricate character details that textual prompts cannot convey, and coping with the few-shot nature of the training data. To achieve this, we propose a multi-token strategy that uses clustering to assign separate tokens to individual characters and to their collective style, combined with LoRA-based parameter-efficient fine-tuning. By removing the class-specific regularization set and introducing random tokens and embeddings during generation, our approach allows for unlimited character creation while preserving the learned style. We evaluate our method on five small specialized datasets, comparing it to relevant baselines using both quantitative metrics and a human evaluation study. Our results show that our approach produces high-quality, diverse characters while preserving the distinctive aesthetic features of the reference characters, and the human evaluation further supports its effectiveness.
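
The multi-token idea in the abstract can be pictured with a minimal sketch: cluster the reference images, give each cluster its own placeholder token, and pair it with one shared style token in every training prompt. Everything concrete below is an assumption for illustration: the 2-D feature vectors, the tiny k-means, the token names `<char0>`, `<char1>`, `<style>`, and the prompt template are hypothetical and are not the authors' implementation.

```python
import math


def kmeans(points, k, iters=20):
    """Tiny k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point that is farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def build_prompts(features, k, style_token="<style>"):
    """One prompt per image: a per-cluster character token plus a shared style token."""
    labels = kmeans(features, k)
    return [f"a <char{lab}> character in {style_token} style" for lab in labels]


# Hypothetical image features (in practice these would come from an image encoder).
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
prompts = build_prompts(features, k=2)
```

In this toy setup the first two images share `<char0>` and the last two share `<char1>`, while `<style>` appears in every prompt, so the style token sees all reference images and each character token sees only its own cluster.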
Problem

Research questions and friction points this paper is trying to address.

Generating unlimited novel characters from few reference examples
Preserving artistic style consistency across generated characters
Overcoming limitations of text prompts in character detail capture
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-token strategy using clustering for character details
LoRA-based parameter-efficient fine-tuning adaptation
Random tokens and embeddings enable unlimited style-preserving generation
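
This page does not say how the random tokens and embeddings are drawn at generation time. As one hypothetical way to picture the last point, the sketch below samples a fresh character embedding from a per-dimension Gaussian fitted to the learned character embeddings, so that new samples stay near the region the model was trained on. The function name `sample_random_embedding`, the Gaussian assumption, and the toy vectors are all illustrative assumptions, not the paper's procedure.

```python
import random
import statistics


def sample_random_embedding(learned, rng):
    """Draw one embedding, dimension by dimension, from a Gaussian whose mean and
    standard deviation match the learned character embeddings (an assumption)."""
    dims = list(zip(*learned))  # transpose: one tuple of values per dimension
    return [rng.gauss(statistics.fmean(d), statistics.pstdev(d)) for d in dims]


# Hypothetical learned embeddings for three character tokens (4 dimensions each).
learned = [
    [0.10, -0.20, 0.05, 0.30],
    [0.12, -0.25, 0.00, 0.28],
    [0.08, -0.15, 0.10, 0.35],
]
rng = random.Random(42)
new_character = sample_random_embedding(learned, rng)  # embedding for an unseen character
```

Each call yields a different vector, which is the sense in which an unbounded number of characters can be requested while the shared style token stays fixed.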
Ruben Pascual
Institute of Smart Cities (ISC) and Department of Statistics, Computer Science and Mathematics, Public University of Navarre (UPNA), Campus Arrosadia, Pamplona, 31006, Navarre, Spain
Mikel Sesma-Sara
Institute of Smart Cities (ISC) and Department of Statistics, Computer Science and Mathematics, Public University of Navarre (UPNA), Campus Arrosadia, Pamplona, 31006, Navarre, Spain
Aranzazu Jurio
Institute of Smart Cities (ISC) and Department of Statistics, Computer Science and Mathematics, Public University of Navarre (UPNA), Campus Arrosadia, Pamplona, 31006, Navarre, Spain
Daniel Paternain
Department of Statistics, Computer Science and Mathematics
Artificial Intelligence, Machine Learning, Computer Vision
Mikel Galar
Full Professor of Computer Science and Artificial Intelligence, Universidad Pública de Navarra
Artificial Intelligence, Data Mining, Machine Learning, Classification, Ensembles