ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization

📅 2025-10-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative models struggle to capture individualized user preferences because fine-grained, real-world preference annotations are scarce. Method: We introduce the first large-scale in-the-wild image generation interaction dataset, covering 57K users, 242K user-customized LoRAs, 3M text prompts, and 5M generated images with preference annotations, and propose a paradigm for personalizing generative models grounded in authentic user interactions. Using these annotations, we train preference alignment models, evaluate retrieval models and a vision-language model on personalized image retrieval and generative model recommendation, and design an end-to-end framework that edits customized diffusion models in a latent weight space to align with individual user preferences. Contribution/Results: Experiments show improved preference alignment and recommendation quality, validating both the dataset's utility and the generality of the proposed paradigm, which bridges real-user interaction signals with latent weight-space adaptation of generative models.
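The dataset's preference annotations tie a user's prompt to a preferred and a non-preferred generation from a given LoRA. A minimal sketch of how one such record could be represented; the schema, field names, and NDJSON layout below are assumptions, not the dataset's actual release format:

```python
import json
from dataclasses import dataclass

@dataclass
class PreferenceTriplet:
    """One interaction record. All field names are hypothetical."""
    user_id: str
    lora_id: str            # which customized LoRA produced the images
    prompt: str
    preferred_image: str    # path/URL of the image the user kept
    rejected_image: str     # path/URL of the image the user passed over

def load_triplets(path: str) -> list[PreferenceTriplet]:
    """Read newline-delimited JSON records into triplet objects."""
    with open(path) as f:
        return [PreferenceTriplet(**json.loads(line)) for line in f]
```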

📝 Abstract
We introduce ImageGem, a dataset for studying generative models that understand fine-grained individual preferences. We posit that a key challenge hindering the development of such a generative model is the lack of in-the-wild and fine-grained user preference annotations. Our dataset features real-world interaction data from 57K users, who collectively have built 242K customized LoRAs, written 3M text prompts, and created 5M generated images. With user preference annotations from our dataset, we were able to train better preference alignment models. In addition, leveraging individual user preference, we investigated the performance of retrieval models and a vision-language model on personalized image retrieval and generative model recommendation. Finally, we propose an end-to-end framework for editing customized diffusion models in a latent weight space to align with individual user preferences. Our results demonstrate that the ImageGem dataset enables, for the first time, a new paradigm for generative model personalization.
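The abstract reports training "better preference alignment models" on these annotations. One standard way to use chosen/rejected pairs is a Bradley-Terry pairwise objective over a scalar scorer, shown here as a minimal PyTorch sketch; the paper's actual model and objective may differ, and the linear scorer and embedding sizes are placeholders.

```python
import torch
import torch.nn.functional as F

def preference_loss(score_preferred: torch.Tensor,
                    score_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: rank the user-preferred image above the rejected one."""
    return -F.logsigmoid(score_preferred - score_rejected).mean()

# Toy usage: a stand-in linear scorer over 512-d image embeddings.
scorer = torch.nn.Linear(512, 1)
emb_preferred = torch.randn(8, 512)   # batch of preferred-image embeddings
emb_rejected = torch.randn(8, 512)    # batch of rejected-image embeddings
loss = preference_loss(scorer(emb_preferred).squeeze(-1),
                       scorer(emb_rejected).squeeze(-1))
loss.backward()
```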
Problem

Research questions and friction points this paper is trying to address.

Lack of in-the-wild fine-grained user preference annotations
Personalized image retrieval and generative model recommendation (a retrieval baseline is sketched after this list)
Aligning customized diffusion models with individual user preferences
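For the retrieval task in the second item, a common baseline is to blend query relevance with a per-user taste vector built from that user's preferred images. The sketch below assumes CLIP-style embeddings and a mixing weight `alpha`; it is an illustrative assumption, not the paper's actual retrieval model.

```python
import torch
import torch.nn.functional as F

def user_profile(preferred_embs: torch.Tensor) -> torch.Tensor:
    """Average a user's preferred-image embeddings into one taste vector."""
    return F.normalize(preferred_embs.mean(dim=0), dim=-1)

def personalized_scores(query_emb: torch.Tensor, user_emb: torch.Tensor,
                        gallery_embs: torch.Tensor, alpha: float = 0.5):
    """Score gallery images by query relevance blended with user taste."""
    query = F.normalize(query_emb, dim=-1)
    gallery = F.normalize(gallery_embs, dim=-1)
    return alpha * (gallery @ query) + (1 - alpha) * (gallery @ user_emb)

# Toy usage with random stand-ins for CLIP-style embeddings.
gallery = torch.randn(1000, 512)             # candidate image embeddings
user = user_profile(torch.randn(20, 512))    # built from 20 preferred images
scores = personalized_scores(torch.randn(512), user, gallery)
top10 = scores.topk(10).indices
```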
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dataset with real-world user interaction data
Framework for editing customized diffusion models in a latent weight space (a generic pattern is sketched after this list)
Training better preference alignment models
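The editing framework in the second item operates in a latent weight space. A generic pattern for that idea: encode flattened LoRA weights into a latent code, nudge the code along a direction that scores higher under the user's preference model, and decode back to weights. The sketch below is hypothetical; dimensions, architecture, and the preference direction are all placeholders, not the paper's design.

```python
import torch
import torch.nn as nn

class LoraAutoencoder(nn.Module):
    """Toy encoder/decoder over flattened LoRA weight vectors."""
    def __init__(self, weight_dim: int = 4096, latent_dim: int = 64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(weight_dim, 512), nn.ReLU(),
                                 nn.Linear(512, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                 nn.Linear(512, weight_dim))

    def edit(self, lora_weights: torch.Tensor,
             preference_direction: torch.Tensor,
             step: float = 0.1) -> torch.Tensor:
        """Encode, nudge the latent toward the preference direction, decode."""
        z = self.enc(lora_weights)
        return self.dec(z + step * preference_direction)

# Toy usage: the direction would come from a model trained on the
# user's preference annotations; here it is random.
ae = LoraAutoencoder()
edited_weights = ae.edit(torch.randn(4096), torch.randn(64))
```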
👥 Authors
Yuanhe Guo
NYU
Linxi Xie
NYU
Zhuoran Chen
New York University Shanghai
Robotics · Computer Vision
Kangrui Yu
NYU
Ryan Po
Stanford
Guandao Yang
Apple
Machine Learning · Computer Vision · Computer Graphics
Gordon Wetzstein
Stanford
Hongyi Wen
NYU