Multiverse: Language-Conditioned Multi-Game Level Blending via Shared Representation

📅 2026-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text-to-level generation methods are largely confined to individual games, struggling to achieve structured cross-game integration with linguistic control. This work proposes a unified framework that aligns natural language instructions with multi-game level structures through a shared latent space and introduces a threshold-based multi-positive contrastive learning mechanism to establish cross-game semantic correspondences. For the first time, the approach enables language-guided fusion of level structures across multiple games, supporting zero-shot generation from compositional textual prompts and controllable interpolation-based mixing. Experimental results demonstrate that the framework significantly improves the quality of level fusion within game genres and successfully achieves controllable, compositional level generation across distinct games.
📝 Abstract
Text-to-level generation aims to translate natural language descriptions into structured game levels, enabling intuitive control over procedural content generation. While prior text-to-level generators are typically limited to a single game domain, extending language-conditioned generation to multiple games requires learning representations that capture structural relationships across domains. We propose Multiverse, a language-conditioned multi-game level generator that enables cross-game level blending through textual specifications. The model learns a shared latent space aligning textual instructions and level structures, while threshold-based multi-positive contrastive supervision links semantically related levels across games. This representation allows language to guide which structural characteristics should be preserved when combining content from different games, enabling controllable blending through latent interpolation and zero-shot generation from compositional textual prompts. Experiments show that the learned representation supports controllable cross-game level blending and significantly improves blending quality within the same game genre, while providing a unified representation for language-conditioned multi-game content generation.
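The abstract's threshold-based multi-positive contrastive supervision can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, the use of a precomputed structural-similarity matrix, and the specific threshold and temperature values are all assumptions; the key idea shown is that any pair of levels whose structural similarity exceeds a threshold is treated as a positive pair, even across games.

```python
import numpy as np

def multi_positive_contrastive_loss(z, struct_sim, tau=0.8, temperature=0.5):
    """Sketch of a threshold-based multi-positive contrastive loss.

    z          : (N, D) L2-normalized level embeddings in the shared latent space.
    struct_sim : (N, N) structural similarity between levels; pairs with
                 struct_sim > tau count as positives, even across games.
    """
    n = z.shape[0]
    logits = (z @ z.T) / temperature              # scaled cosine similarities
    mask_self = np.eye(n, dtype=bool)
    pos = (struct_sim > tau) & ~mask_self         # threshold defines the positive set

    # softmax denominator over all non-self pairs for each anchor
    logits_masked = np.where(mask_self, -np.inf, logits)
    log_denom = np.log(np.exp(logits_masked).sum(axis=1))

    losses = []
    for i in range(n):
        if pos[i].any():                          # skip anchors with no positives
            log_prob = logits[i, pos[i]] - log_denom[i]
            losses.append(-log_prob.mean())       # average over multiple positives
    return float(np.mean(losses)) if losses else 0.0
```

Because the positive set is defined by a similarity threshold rather than class labels, an anchor can pull toward several structurally related levels at once, which is what establishes cross-game semantic correspondences in the shared space.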
Problem

Research questions and friction points this paper is trying to address.

text-to-level generation
multi-game
level blending
cross-game
procedural content generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

text-to-level generation
multi-game level blending
shared latent space
contrastive supervision
language-conditioned generation
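The controllable blending via latent interpolation mentioned above can be illustrated with a small sketch. The paper does not specify the interpolation scheme here; spherical linear interpolation (slerp) is one common choice for latent spaces, so the function below is an assumption, not the authors' method.

```python
import numpy as np

def blend_levels(z_a, z_b, alpha):
    """Spherical interpolation between two level embeddings.

    z_a, z_b : latent codes of levels from two (possibly different) games.
    alpha    : blend ratio in [0, 1]; 0 yields z_a, 1 yields z_b.
    Returns a latent code a decoder could map to a blended level.
    """
    a = z_a / np.linalg.norm(z_a)
    b = z_b / np.linalg.norm(z_b)
    theta = np.arccos(np.clip(a @ b, -1.0, 1.0))
    if np.isclose(theta, 0.0):                    # nearly parallel: plain lerp
        return (1 - alpha) * z_a + alpha * z_b
    return (np.sin((1 - alpha) * theta) * z_a
            + np.sin(alpha * theta) * z_b) / np.sin(theta)
```

Sweeping `alpha` from 0 to 1 traces a path through the shared latent space, which is what makes the blending controllable rather than a fixed mixture.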
In-Chang Baek
AI Graduate School, GIST
Procedural Content Generation, Game Artificial Intelligence
Jiyun Jung
Dongguk University, South Korea
Sung-Hyun Kim
Gwangju Institute of Science and Technology (GIST), South Korea
Geum-Hwan Hwang
Gwangju Institute of Science and Technology (GIST), South Korea
Kyung-Joong Kim
Professor, Department of AI Convergence, GIST
Artificial Intelligence, Games, Game AI