Enhancing Physical Consistency in Lightweight World Models

📅 2025-09-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the trade-off between the weak physical consistency of lightweight world models and the deployment cost of large ones, this paper proposes the Physics-Informed BEV World Model (PIWM), a compact world model operating on bird's-eye-view (BEV) representations. Methodologically: (1) a Soft Mask training mechanism improves the modeling of dynamic objects and future prediction; (2) a Warm Start inference technique improves the prediction quality of a zero-shot model. Experiments show that, at the same parameter scale (400M), PIWM achieves a 60.6% higher weighted overall score than the baseline; even the smallest variant (130M, Soft Mask) outperforms the largest baseline (400M) by 7.4% in weighted overall score while running inference 28% faster. The core contribution is the combination of Soft Mask training and Warm Start inference in a compact BEV world model, which reduces computational cost while preserving physical dynamics modeling.

📝 Abstract

A major challenge in deploying world models is the trade-off between size and performance. Large world models can capture rich physical dynamics but require massive computing resources, making them impractical for edge devices. Small world models are easier to deploy but often struggle to learn accurate physics, leading to poor predictions. We propose the Physics-Informed BEV World Model (PIWM), a compact model designed to efficiently capture physical interactions in bird's-eye-view (BEV) representations. PIWM uses Soft Mask during training to improve dynamic object modeling and future prediction. We also introduce a simple yet effective technique, Warm Start, for inference to enhance prediction quality with a zero-shot model. Experiments show that at the same parameter scale (400M), PIWM surpasses the baseline by 60.6% in weighted overall score. Moreover, even when compared with the largest baseline model (400M), the smallest PIWM (130M Soft Mask) achieves a 7.4% higher weighted overall score with a 28% faster inference speed.
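The abstract does not say how Soft Mask is computed. As a purely illustrative sketch, one generic way to emphasize dynamic objects in a BEV grid is to weight each cell's prediction error by a mask value between 0 and 1 instead of a hard 0/1 mask. Every name below (`soft_mask_loss`, `base_weight`) is hypothetical, and the weighting scheme is an assumption made for illustration, not PIWM's actual formulation.

```python
def soft_mask_loss(pred, target, mask, base_weight=0.1):
    """Weighted mean squared error over a 2D BEV grid.

    pred, target: 2D lists of floats with the same shape.
    mask: 2D list of floats in [0, 1]; higher means the cell is more
          likely to belong to a dynamic object (assumed convention).
    base_weight: hypothetical floor so static cells still contribute.
    """
    total = 0.0
    weight_sum = 0.0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            if not 0.0 <= m <= 1.0:
                raise ValueError("mask values must lie in [0, 1]")
            # Interpolate the weight between base_weight (static) and 1.0 (dynamic).
            w = base_weight + (1.0 - base_weight) * m
            total += w * (p - t) ** 2
            weight_sum += w
    # Normalize by the total weight so the loss stays on the scale of a plain MSE.
    return total / weight_sum


# The same error costs more when it falls on a cell marked as dynamic.
pred = [[1.0, 0.0]]
target = [[0.0, 0.0]]
loss_on_dynamic = soft_mask_loss(pred, target, mask=[[1.0, 0.0]])
loss_on_static = soft_mask_loss(pred, target, mask=[[0.0, 1.0]])
```

With an all-zero mask every cell receives the same weight, so the function reduces to an ordinary mean squared error; the mask only changes how the error is distributed across cells.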

Problem

Research questions and friction points this paper is trying to address.

Balancing model size and physical accuracy

Improving physical consistency in lightweight world models

Enabling edge-device deployment through efficient prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-Informed BEV World Model

Soft Mask training technique

Warm Start inference method
