PhysiX: A Foundation Model for Physics Simulations

📅 2025-06-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Physics simulation faces critical bottlenecks, including severe data scarcity (the largest datasets contain only tens of thousands of samples), difficulty with long-horizon prediction, and the complexity of multi-scale modeling, all of which hinder the adoption of large models. To address these challenges, the paper proposes the first foundation model designed specifically for physics simulation: it discretizes continuous physical processes into token sequences and introduces a dedicated refinement module to mitigate quantization error. The model employs a 4.5-billion-parameter autoregressive architecture that models dynamics directly in token space and is jointly trained with natural-video data to enhance spatiotemporal understanding. This framework achieves general-purpose generative physics simulation, outperforming prior state-of-the-art methods on The Well benchmark. Moreover, it demonstrates significant gains in few-shot settings, surpassing task-specific models, which validates its cross-task synergistic learning and robust multi-scale generalization.

📝 Abstract
Foundation models have achieved remarkable success across video, image, and language domains. By scaling up the number of parameters and training datasets, these models acquire generalizable world knowledge and often surpass task-specific approaches. However, such progress has yet to extend to the domain of physics simulation. A primary bottleneck is data scarcity: while millions of images, videos, and textual resources are readily available on the internet, the largest physics simulation datasets contain only tens of thousands of samples. This data limitation hinders the use of large models, as overfitting becomes a major concern. As a result, physics applications typically rely on small models, which struggle with long-range prediction due to limited context understanding. Additionally, unlike images, videos, or text, which typically exhibit fixed granularity, physics datasets often vary drastically in scale, amplifying the challenges of scaling up multitask training. We introduce PhysiX, the first large-scale foundation model for physics simulation. PhysiX is a 4.5B parameter autoregressive generative model. It uses a discrete tokenizer to encode physical processes at different scales into a sequence of discrete tokens, and employs an autoregressive next-token prediction objective to model such processes in the token space. To mitigate the rounding error in the discretization process, PhysiX incorporates a specialized refinement module. Through extensive experiments, we show that PhysiX effectively addresses the data bottleneck, outperforming task-specific baselines under comparable settings as well as the previous absolute state-of-the-art approaches on The Well benchmark. Our results indicate that knowledge learned from natural videos can be successfully transferred to physics simulation, and that joint training across diverse simulation tasks enables synergistic learning.
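The tokenize-predict-refine pipeline described above can be illustrated with a minimal sketch. This is not PhysiX's actual implementation: the patch size, codebook, and field shapes are all made-up toy values, the codebook is random rather than learned (a real system would train it, e.g. with a VQ-VAE objective), and the refinement step is stood in for by the true quantization residual to show what such a module's regression target looks like.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy continuous "physics field": a 1D snapshot split into patches,
# mimicking how a discrete tokenizer turns a continuous state into tokens.
# PATCH and CODEBOOK_SIZE are illustrative, not PhysiX's actual settings.
PATCH = 4
CODEBOOK_SIZE = 64

# Codebook of prototype patches (learned in a real model; random here).
codebook = rng.normal(size=(CODEBOOK_SIZE, PATCH))

def tokenize(field):
    """Map each patch of the continuous field to its nearest codebook index."""
    patches = field.reshape(-1, PATCH)
    # Squared distance from every patch to every codebook entry.
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def detokenize(tokens):
    """Reconstruct a continuous field from discrete tokens (lossy)."""
    return codebook[tokens].reshape(-1)

field = rng.normal(size=32)       # 32 values -> 8 tokens
tokens = tokenize(field)
recon = detokenize(tokens)

# Discretization introduces rounding error; a refinement module would be
# trained to predict this residual from the reconstruction. Here we use
# the true residual to show the module's target.
residual = field - recon
refined = recon + residual

err_before = np.abs(field - recon).max()   # quantization error
err_after = np.abs(field - refined).max()  # error after (ideal) refinement
```

An autoregressive model would then be trained with next-token prediction over sequences of such indices, so that generating a future state amounts to sampling tokens and decoding plus refining them back into a continuous field.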
Problem

Research questions and friction points this paper is trying to address.

Lack of large-scale physics simulation datasets hinders foundation models
Small models struggle with long-range predictions in physics
Physics datasets vary in scale, complicating multitask training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale autoregressive generative model for physics
Discrete tokenizer encodes multi-scale physical processes
Specialized refinement module reduces discretization errors