FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution

📅 2025-05-29
🤖 AI Summary
This work addresses **unbounded, accretive surface growth**, a dynamic geometric modeling problem for physics-aware world models, with the goal of predicting and steering surface evolution from partial, multimodal sensory observations. It proposes FOLIAGE, a physics-informed multimodal world model built around a **Modality-Agnostic Growth Embedding (MAGE)** and an **Accretive Graph Network (AGN)**. A unified context encoder maps images, mesh connectivity, and point clouds to a shared latent state, and a physics-aware predictor conditioned on control actions advances that state in time. AGN models dynamic connectivity with Age Positional Encoding and Energy-Gated Message-Passing, while Geometry-Correspondence Fusion, Cross-Patch Masking, and Hierarchical Pooling enrich the embedding. Using 7,200 surface-growth sequences generated by the SURF-GARDEN platform and the SURF-BENCH evaluation suite, FOLIAGE outperforms specialized baselines across six core tasks, including topology recognition and inverse material estimation, and remains robust under four stress tests: sensor dropout, zero-shot modality transfer, long-horizon prediction, and physics ablation.

📝 Abstract
Physical intelligence -- anticipating and shaping the world from partial, multisensory observations -- is critical for next-generation world models. We propose FOLIAGE, a physics-informed multimodal world model for unbounded accretive surface growth. In its Action-Perception loop, a unified context encoder maps images, mesh connectivity, and point clouds to a shared latent state. A physics-aware predictor, conditioned on physical control actions, advances this latent state in time to align with the target latent of the surface, yielding a Modality-Agnostic Growth Embedding (MAGE) that interfaces with critic heads for downstream objectives. FOLIAGE's Accretive Graph Network (AGN) captures dynamic connectivity through Age Positional Encoding and Energy-Gated Message-Passing. Geometry-Correspondence Fusion and Cross-Patch Masking enhance MAGE's expressiveness, while Hierarchical Pooling balances global context with local dynamics. We create SURF-GARDEN, a world model learning platform comprising a Counterfactual Physics Simulator, a Multimodal Correspondence Extractor, and Evolution Tracing, which generates 7,200 diverse surface-growth sequences. SURF-BENCH, our physical-intelligence evaluation suite, evaluates six core tasks -- topology recognition, inverse material estimation, growth-stage classification, latent roll-out, cross-modal retrieval, and dense correspondence -- and four stress tests -- sensor dropout, zero-shot modality transfer, long-horizon prediction, and physics ablation -- to probe resilience. FOLIAGE outperforms specialized baselines while remaining robust across dynamic environments, establishing a new world-model based, multimodal pathway to physical intelligence.
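The Action-Perception loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear per-modality encoders, the averaging of modality latents, the tanh predictor, the mean-squared alignment loss, and all function and variable names. The sketch contains no physics terms; it only shows the data flow from observations to a shared latent, an action-conditioned step, and a comparison against the target latent.

```python
import math

def encode(observations, weights):
    """Map each observed modality to a shared latent and average the results.

    Hypothetical stand-in for a unified context encoder: one linear map per
    modality. A modality set to None (e.g. a failed sensor) is skipped.
    """
    latents = []
    for name, features in observations.items():
        if features is None:
            continue
        rows = weights[name]
        latents.append([sum(w * x for w, x in zip(row, features)) for row in rows])
    if not latents:
        raise ValueError("at least one modality must be observed")
    dim = len(latents[0])
    return [sum(z[i] for z in latents) / len(latents) for i in range(dim)]

def predict(latent, action, w_latent, w_action):
    """Advance the latent one step, conditioned on a control action (assumed form)."""
    out = []
    for row_z, row_a in zip(w_latent, w_action):
        pre = sum(w * z for w, z in zip(row_z, latent))
        pre += sum(w * a for w, a in zip(row_a, action))
        out.append(math.tanh(pre))
    return out

def alignment_loss(predicted, target):
    """Mean squared error between predicted and target latents (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

# Toy weights and observations, all hypothetical.
weights = {
    "image": [[1.0, 0.0], [0.0, 1.0]],
    "mesh": [[0.5, 0.5], [0.5, -0.5]],
    "points": [[1.0, 1.0], [0.0, 1.0]],
}
obs_t = {"image": [0.2, 0.4], "mesh": None, "points": [0.1, 0.3]}  # mesh missing
obs_next = {"image": [0.3, 0.5], "mesh": [0.2, 0.2], "points": [0.2, 0.4]}
w_latent = [[1.0, 0.0], [0.0, 1.0]]
w_action = [[0.1], [0.2]]

z_t = encode(obs_t, weights)                      # context latent at time t
z_pred = predict(z_t, [1.0], w_latent, w_action)  # action-conditioned step
z_target = encode(obs_next, weights)              # target latent at time t+1
loss = alignment_loss(z_pred, z_target)
```

Because the encoder averages only the modalities that are present, the same code path handles a missing sensor, which is the situation the sensor-dropout stress test probes.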
Problem

Research questions and friction points this paper is trying to address.

Develop a physics-informed multimodal world model for unbounded surface growth
Enhance physical intelligence via unified context encoding and physics-aware prediction
Evaluate model robustness across diverse tasks and dynamic environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-informed multimodal world model
Modality-Agnostic Growth Embedding (MAGE)
Accretive Graph Network (AGN)
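
As a rough illustration of the two AGN ingredients named in the abstract, the sketch below pairs a sinusoidal encoding of vertex age with a sigmoid-gated neighbor aggregation. The source does not give formulas for either component, so the sinusoidal form, the sigmoid gate driven by a per-edge "energy" score, the residual update, and every name in the code are assumptions, not the authors' implementation.

```python
import math

def age_encoding(age, dim=4, base=10000.0):
    """Sinusoidal encoding of a vertex's age (steps since it was added).

    Assumed form, borrowed from standard positional encodings.
    """
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (base ** (2 * i / dim))
        enc.extend([math.sin(age * freq), math.cos(age * freq)])
    return enc

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def gated_message_pass(features, edges, edge_energy):
    """One round of message passing with a per-edge gate.

    Each directed edge (src, dst) carries a scalar energy score; the gate
    sigmoid(energy) scales the message. Each vertex adds the gate-weighted
    mean of its incoming messages to its own features (residual update).
    """
    n, dim = len(features), len(features[0])
    agg = [[0.0] * dim for _ in range(n)]
    total_gate = [0.0] * n
    for (src, dst), energy in zip(edges, edge_energy):
        gate = sigmoid(energy)
        for k in range(dim):
            agg[dst][k] += gate * features[src][k]
        total_gate[dst] += gate
    out = []
    for v in range(n):
        if total_gate[v] > 0.0:
            out.append([features[v][k] + agg[v][k] / total_gate[v] for k in range(dim)])
        else:
            out.append(list(features[v]))  # no incoming edges: unchanged
    return out

# Three vertices of different ages; vertex 2 was just added and is isolated.
ages = [5, 2, 0]
features = [age_encoding(a) for a in ages]
edges = [(0, 1), (1, 0)]
edge_energy = [2.0, -2.0]
updated = gated_message_pass(features, edges, edge_energy)
```

A growing surface can be handled in this sketch by appending new vertices with age 0 and passing the enlarged edge list on the next call, since nothing depends on a fixed graph size.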