🤖 AI Summary
Existing PCGRL methods struggle to model human design intent, limiting their practical integration into creative workflows. To address this, we propose VIPCGRL, a novel framework that establishes, for the first time, a shared multimodal embedding space jointly representing textual descriptions, level layouts, and hand-drawn sketches. The space is trained via quadruple contrastive learning with cross-modal style alignment, enabling human-style-aware generative control. The method combines deep reinforcement learning with this multimodal semantic alignment through a similarity-based auxiliary reward, supporting policy optimization conditioned on heterogeneous inputs (text, sketch, layout). Experiments show that VIPCGRL significantly outperforms state-of-the-art baselines: it achieves a +23.6% improvement in style-consistency score and an 87.4% expert preference rate in human evaluation. These results demonstrate its effectiveness in enhancing human-AI collaboration and creative utility in procedural level generation.
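The summary does not specify the form of the quadruple contrastive objective. A plausible minimal sketch, assuming a standard pairwise InfoNCE loss summed over the four embedding sets (text, level, sketch, and a human/AI style view) so that all views are pulled into one shared space; all function names and the pairwise-sum structure are illustrative assumptions, not the paper's actual loss:

```python
import numpy as np

def info_nce(a: np.ndarray, b: np.ndarray, tau: float = 0.1) -> float:
    """Standard InfoNCE between two batches of embeddings:
    a[i] and b[i] are a positive pair; all other pairs are negatives."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)   # L2-normalise rows
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / tau                             # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))         # positives on the diagonal

def quadruple_contrastive_loss(text, level, sketch, style, tau: float = 0.1) -> float:
    """Sum pairwise InfoNCE over the four embedding sets, aligning every
    modality (and the style view) within a single shared space."""
    views = [text, level, sketch, style]
    loss = 0.0
    for i in range(len(views)):
        for j in range(i + 1, len(views)):
            loss += info_nce(views[i], views[j], tau)
    return loss
```

Summing over all six view pairs is one simple way to realise "quadruple" alignment; the actual paper may weight pairs differently or use hard-negative mining.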
📝 Abstract
Human-aligned AI is a critical component of co-creativity, as it enables models to accurately interpret human intent and generate controllable outputs that align with design goals in collaborative content creation. This direction is especially relevant in procedural content generation via reinforcement learning (PCGRL), which is intended to serve as a tool for human designers. However, existing systems often fall short of exhibiting human-centered behavior, limiting the practical utility of AI-driven generation tools in real-world design workflows. In this paper, we propose VIPCGRL (Vision-Instruction PCGRL), a novel deep reinforcement learning framework that incorporates three modalities (text, levels, and sketches) to broaden the available control modalities and enhance human-likeness. We introduce a shared embedding space trained via quadruple contrastive learning across modalities and human-AI styles, and align the policy using an auxiliary reward based on embedding similarity. Experimental results show that VIPCGRL outperforms existing baselines in human-likeness, as validated by both quantitative metrics and human evaluations. The code and dataset will be available upon publication.
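The abstract's "auxiliary reward based on embedding similarity" can be sketched as reward shaping: the RL reward is augmented with the cosine similarity between the embedding of the generated level and the embedding of the conditioning input (text, sketch, or example level) in the shared space. The function name, weighting scheme, and base-reward interface below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def auxiliary_reward(level_emb: np.ndarray, target_emb: np.ndarray,
                     base_reward: float, weight: float = 0.5) -> float:
    """Shape the PCGRL reward with cosine similarity in the shared
    embedding space: higher similarity to the conditioning input
    (text, sketch, or layout embedding) yields a larger bonus."""
    cos = float(np.dot(level_emb, target_emb) /
                (np.linalg.norm(level_emb) * np.linalg.norm(target_emb) + 1e-8))
    return base_reward + weight * cos
```

In this sketch `weight` trades off task reward (e.g. playability constraints) against stylistic alignment; a level whose embedding matches the condition perfectly receives the full bonus, while an orthogonal one receives none.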