Large Model Empowered Metaverse: State-of-the-Art, Challenges and Opportunities

πŸ“… 2025-01-18
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 3
✨ Influential: 0
πŸ€– AI Summary
Metaverse applications face critical bottlenecks including high real-time rendering latency, poor adaptability to dynamic scenes, and limited scalability. To address these challenges, this paper proposes a large language model (LLM)-empowered cloud-edge-end collaborative generative AI rendering framework. It introduces two key innovations: (1) a mobility-aware pre-rendering mechanism that anticipates user movement for proactive resource allocation, and (2) a diffusion model-driven adaptive rendering strategy that dynamically optimizes visual fidelity and computational load based on scene complexity and device capabilities. The framework tightly integrates LLMs, video foundation models (e.g., Sora), and hierarchical distributed computing across cloud, edge, and end devices. Experimental evaluation demonstrates a 37% reduction in end-to-end rendering latency and significantly enhanced real-time immersion under high-concurrency, highly dynamic conditions. This work establishes a scalable, generative-AI-native technical pathway for next-generation Metaverse systems.
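
The mobility-aware pre-rendering mechanism is only described at this level of detail. As a rough illustration of the idea, the sketch below pairs linear dead reckoning (a stand-in for whatever motion predictor the authors actually use) with speculative tile rendering; all names (`UserState`, `prerender`, the tile grid) are hypothetical, not the paper's API.

```python
from dataclasses import dataclass

@dataclass
class UserState:
    x: float
    y: float
    vx: float  # estimated velocity, metres per second
    vy: float

def predict_position(state: UserState, horizon_s: float) -> tuple[float, float]:
    """Linear dead reckoning: assume constant velocity over the look-ahead."""
    return (state.x + state.vx * horizon_s, state.y + state.vy * horizon_s)

def tiles_around(pos: tuple[float, float], radius: int, tile_size: float = 10.0):
    """Enumerate scene tiles within `radius` tiles of a position."""
    cx, cy = int(pos[0] // tile_size), int(pos[1] // tile_size)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            yield (cx + dx, cy + dy)

def prerender(state: UserState, horizon_s: float, cache: dict) -> None:
    """Render tiles the user is predicted to see before they arrive there."""
    for tile in tiles_around(predict_position(state, horizon_s), radius=1):
        if tile not in cache:                  # skip tiles already rendered
            cache[tile] = f"frame-for-{tile}"  # placeholder for a real render call

cache: dict = {}
prerender(UserState(x=12.0, y=5.0, vx=1.5, vy=0.0), horizon_s=2.0, cache=cache)
print(sorted(cache))  # 3x3 block of tiles around the predicted position (15.0, 5.0)
```

Proactive resource allocation then amounts to warming the cache along the predicted trajectory, so the render result is already local when the user actually reaches the tile.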

πŸ“ Abstract
The Metaverse represents a transformative shift beyond traditional mobile Internet, creating an immersive, persistent digital ecosystem where users can interact, socialize, and work within 3D virtual environments. Powered by large models such as ChatGPT and Sora, the Metaverse benefits from precise large-scale real-world modeling, automated multimodal content generation, realistic avatars, and seamless natural language understanding, which enhance user engagement and enable more personalized, intuitive interactions. However, challenges remain, including limited scalability, constrained responsiveness, and low adaptability in dynamic environments. This paper investigates the integration of large models within the Metaverse, examining their roles in enhancing user interaction, perception, content creation, and service quality. To address existing challenges, we propose a generative AI-based framework for optimizing Metaverse rendering. This framework includes a cloud-edge-end collaborative model to allocate rendering tasks with minimal latency, a mobility-aware pre-rendering mechanism that dynamically adjusts to user movement, and a diffusion model-based adaptive rendering strategy to fine-tune visual details. Experimental results demonstrate the effectiveness of our approach in enhancing rendering efficiency and reducing rendering overheads, advancing large model deployment for a more responsive and immersive Metaverse.
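
The abstract does not specify the cloud-edge-end collaborative model beyond "allocate rendering tasks with minimal latency". A minimal sketch of one plausible reading, greedy per-task placement under an assumed compute-plus-round-trip cost model; every throughput and RTT number below is invented for illustration, not taken from the paper:

```python
# Assumed per-tier characteristics: throughput (GFLOP/s) and round-trip
# network delay (ms). These values are illustrative only.
TIERS = {
    "device": {"gflops": 50.0,   "rtt_ms": 0.0},
    "edge":   {"gflops": 500.0,  "rtt_ms": 10.0},
    "cloud":  {"gflops": 5000.0, "rtt_ms": 60.0},
}

def est_latency_ms(task_gflop: float, tier: dict) -> float:
    """Latency = compute time on the tier + network round trip to reach it."""
    return task_gflop / tier["gflops"] * 1000.0 + tier["rtt_ms"]

def place(task_gflop: float) -> str:
    """Greedily place one rendering task on the lowest-latency tier."""
    return min(TIERS, key=lambda name: est_latency_ms(task_gflop, TIERS[name]))

for work in (0.5, 20.0, 400.0):  # light, medium, heavy tasks (GFLOP)
    print(f"{work:6.1f} GFLOP -> {place(work)}")
# light work stays on-device, medium offloads to the edge, heavy to the cloud
```

The qualitative behavior this captures is the usual offloading trade-off: the device avoids network delay, while edge and cloud trade increasing round-trip cost for increasing compute.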
Problem

Research questions and friction points this paper is trying to address.

Enhancing Metaverse scalability and responsiveness with large models
Optimizing rendering efficiency using a generative AI-based framework
Improving dynamic environment adaptability in Metaverse systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cloud-edge-end collaborative rendering model
Mobility-aware pre-rendering mechanism
Diffusion model-based adaptive rendering strategy (see the sketch after this list)
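
The adaptive rendering strategy "fine-tunes visual details" based on scene complexity and device capability, but the paper gives no concrete scheduler. A hedged sketch of the underlying trade-off, using denoising-step count as the fidelity knob; the step range and the cap-by-budget rule are our assumptions:

```python
def choose_denoise_steps(scene_complexity: float, device_budget: float,
                         min_steps: int = 4, max_steps: int = 50) -> int:
    """Pick a denoising-step count: fidelity target capped by compute budget.

    Both inputs are normalized to [0, 1]; the ranges are assumed, not the
    paper's. More steps = finer visual detail but higher per-frame cost.
    """
    span = max_steps - min_steps
    wanted = min_steps + scene_complexity * span  # what the scene deserves
    afford = min_steps + device_budget * span     # what the device can pay for
    return round(min(wanted, afford))

# A simple vista on an idle headset vs. a crowded scene on a throttled phone:
print(choose_denoise_steps(scene_complexity=0.2, device_budget=0.9))  # -> 13
print(choose_denoise_steps(scene_complexity=0.9, device_budget=0.1))  # -> 9
```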