AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization

📅 2025-06-06
🤖 AI Summary
Designers need an efficient way to extract standardized, front-facing, reusable design assets from open-scene images, but existing generative models struggle to deliver high fidelity, orthogonality (i.e., canonical viewpoint alignment), and robustness at the same time, particularly under occlusion and complex viewpoints. This paper introduces AssetDropper, which it presents as the first generative framework tailored to design asset extraction. It builds a reward model around an inverse-paste task (pasting the extracted asset back into the reference image), enabling closed-loop, reward-driven optimization that mitigates hallucination and improves adherence to the image prompt. The method is built on a diffusion architecture, uses more than 200K synthetic image-subject pairs for training, and is evaluated on a real-world benchmark. Experiments demonstrate state-of-the-art performance in design asset extraction, yielding high-fidelity, orthogonally aligned, and editable outputs. The framework has been successfully validated within real-world design workflows.

📝 Abstract
Recent research on generative models has primarily focused on creating product-ready visual outputs; however, designers often favor access to standardized asset libraries, a domain that has yet to be significantly enhanced by generative capabilities. Although open-world scenes provide ample raw materials for designers, efficiently extracting high-quality, standardized assets remains a challenge. To address this, we introduce AssetDropper, the first framework designed to extract assets from reference images, providing artists with an open-world asset palette. Our model adeptly extracts a front view of selected subjects from input images, effectively handling complex scenarios such as perspective distortion and subject occlusion. We establish a synthetic dataset of more than 200,000 image-subject pairs and a real-world benchmark with thousands more for evaluation, facilitating future research on downstream tasks. Furthermore, to ensure precise asset extraction that aligns well with the image prompts, we employ a pre-trained reward model to form a closed loop with feedback. We design the reward model to perform an inverse task that pastes the extracted assets back into the reference sources, which assists training with additional consistency and mitigates hallucination. Extensive experiments show that, with the aid of reward-driven optimization, AssetDropper achieves state-of-the-art results in asset extraction. Project page: AssetDropper.github.io.
Problem

Research questions and friction points this paper is trying to address.

Extracting standardized assets from open-world scenes efficiently
Handling perspective distortion and occlusion in asset extraction
Ensuring precise asset alignment with image prompts via feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses diffusion models for asset extraction
Employs reward-driven optimization for precision
Creates a synthetic dataset for training and a real-world benchmark for evaluation
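
The inverse-paste idea behind the reward can be illustrated with a toy sketch. This is not the paper's implementation: AssetDropper uses a pre-trained reward model for the paste-back step, whereas the hypothetical `paste_back` below is a naive stand-in that resizes the asset (nearest neighbor) into the bounding box of the subject mask. The function names, the mask input, and the negative mean-squared-error score are all assumptions made for illustration.

```python
import numpy as np


def paste_back(asset, reference, mask):
    """Hypothetical stand-in for a learned inverse model.

    Resizes `asset` (nearest neighbor) to the bounding box of `mask`
    and writes it into a copy of `reference` at the masked pixels.
    """
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    h, w = y1 - y0, x1 - x0
    # Nearest-neighbor index maps from the bounding box into the asset.
    rows = np.arange(h) * asset.shape[0] // h
    cols = np.arange(w) * asset.shape[1] // w
    resized = asset[rows][:, cols]
    out = reference.copy()
    region = mask[y0:y1, x0:x1].astype(bool)
    # Basic slicing returns a view, so this assignment updates `out`.
    out[y0:y1, x0:x1][region] = resized[region]
    return out


def consistency_reward(asset, reference, mask):
    """Assumed reward: negative MSE between the re-pasted and original subject pixels."""
    repasted = paste_back(asset, reference, mask)
    m = mask.astype(bool)
    diff = repasted[m].astype(np.float64) - reference[m].astype(np.float64)
    return -float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((16, 16, 3))          # toy scene image in [0, 1]
    mask = np.zeros((16, 16), dtype=bool)
    mask[4:12, 5:13] = True                      # toy subject region

    faithful = reference[4:12, 5:13].copy()      # asset matching the scene
    hallucinated = 1.0 - faithful                # asset with altered content

    print("faithful reward:    ", consistency_reward(faithful, reference, mask))
    print("hallucinated reward:", consistency_reward(hallucinated, reference, mask))
```

In this toy setting, an asset that matches the subject in the reference scores higher than one whose content has drifted, which is the kind of consistency signal a closed feedback loop could use to penalize hallucination during fine-tuning.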
Lanjiong Li
The Hong Kong University of Science and Technology (Guangzhou)
Generative Model
Guanhua Zhao
School of Electronic and Computer Engineering, Peking University, China
Lingting Zhu
The University of Hong Kong
Generative Models, Computer Vision
Zeyu Cai
Institute of Heavy Ion Physics, Peking University
AI for Science, Plasma Physics, AI Agents, Number Theory
Lequan Yu
Assistant Professor, The University of Hong Kong
Medical Image Analysis, Multimodal Learning, Computational Pathology, AI for Healthcare
Jian Zhang
School of Electronic and Computer Engineering, Peking University, China
Zeyu Wang
The Hong Kong University of Science and Technology (Guangzhou), China and The Hong Kong University of Science and Technology, China