VICON: A Foundation Model for Multi-Physics Fluid Dynamics via Vision In-Context Operator Networks

📅 2024-11-25
📈 Citations: 0
Influential: 0

🤖 AI Summary
In-Context Operator Networks (ICONs) are computationally inefficient and scale poorly on dense, high-dimensional data, which has limited their use on two-dimensional functions. This work integrates a Vision Transformer into the in-context operator learning framework and represents functions patch-wise. The resulting multi-physics fluid model supports dynamic context construction, handles variable time steps and sparse-frame inputs, and is pretrained across multiple physics to improve generalization. On two compressible-flow benchmark datasets, it reduces the rescaled L² error by 40% and 61.6%, respectively, and in long-term rollout prediction it needs only about one-third of the inference time per frame of the state-of-the-art Multiple Physics Pretraining (MPP) model.

📝 Abstract
In-Context Operator Networks (ICONs) are models that learn operators across different types of PDEs using a few-shot, in-context approach. Although they show successful generalization to various PDEs, existing methods treat each data point as a single token, and suffer from computational inefficiency when processing dense data, limiting their application in higher spatial dimensions. In this work, we propose Vision In-Context Operator Networks (VICON), incorporating a vision transformer architecture that efficiently processes 2D functions through patch-wise operations. We evaluated our method on three fluid dynamics datasets, demonstrating both superior performance (reducing the rescaled $L^2$ error by $40\%$ and $61.6\%$ for two benchmark datasets for compressible flows, respectively) and computational efficiency (requiring only one-third of the inference time per frame) in long-term rollout predictions compared to the current state-of-the-art sequence-to-sequence model with fixed timestep prediction: Multiple Physics Pretraining (MPP). Compared to MPP, our method preserves the benefits of in-context operator learning, enabling flexible context formation when dealing with insufficient frame counts or varying timestep values.
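
To make the patch-wise idea concrete, here is a minimal NumPy sketch that splits a 2D field into non-overlapping patches and flattens each patch into one token. The grid size (128×128), channel count (3), and patch size (16) are illustrative assumptions, not values taken from the paper, and `patchify` / `unpatchify` are hypothetical helper names rather than part of any VICON code.

```python
import numpy as np

def patchify(field, patch):
    """Split an (H, W, C) field into non-overlapping patch tokens.

    Returns an array of shape (num_patches, patch * patch * C).
    """
    h, w, c = field.shape
    if h % patch or w % patch:
        raise ValueError("grid size must be divisible by the patch size")
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch, C) -> (gh, gw, patch, patch, C)
    blocks = field.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    return blocks.reshape(gh * gw, patch * patch * c)

def unpatchify(tokens, h, w, c, patch):
    """Inverse of patchify: rebuild the (H, W, C) field from patch tokens."""
    gh, gw = h // patch, w // patch
    blocks = tokens.reshape(gh, gw, patch, patch, c).transpose(0, 2, 1, 3, 4)
    return blocks.reshape(h, w, c)

# Illustrative sizes only (assumed, not from the paper).
field = np.arange(128 * 128 * 3, dtype=np.float64).reshape(128, 128, 3)
tokens = patchify(field, patch=16)
restored = unpatchify(tokens, 128, 128, 3, patch=16)
```

With these assumed sizes, one frame becomes 64 tokens instead of the 16,384 tokens a one-token-per-grid-point scheme would need, which is where the savings come from, since attention cost grows with sequence length.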
Problem

Research questions and friction points this paper is trying to address.

Efficiently process dense data
Improve long-term rollout predictions
Enable flexible context formation
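
Flexible context formation can be pictured as choosing which pairs of frames serve as in-context examples. The sketch below assumes, in the spirit of ICON's condition/quantity-of-interest pairs, that each example pairs a frame with the frame `stride` steps later; the function name `build_context_pairs` and the `max_pairs` argument are hypothetical and only illustrate the idea.

```python
def build_context_pairs(num_frames, stride, max_pairs=None):
    """Return index pairs (i, i + stride) usable as in-context examples.

    num_frames: number of frames available in the trajectory.
    stride:     timestep gap between the condition frame and the target frame.
    max_pairs:  optional cap; keeps the most recent pairs (assumed policy).
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    pairs = [(i, i + stride) for i in range(num_frames - stride)]
    if max_pairs is not None:
        pairs = pairs[-max_pairs:]
    return pairs

# Same five frames, two different timestep choices.
pairs_stride1 = build_context_pairs(5, stride=1)
pairs_stride2 = build_context_pairs(5, stride=2)
```

Under this assumption, a short trajectory still yields a usable (if smaller) context, and changing the timestep only changes which frames are paired rather than requiring a different model.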
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision transformer enhances 2D processing
Patch-wise operations improve computational efficiency
In-context learning for flexible PDE solutions
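
As a rough illustration of how in-context examples can drive a rollout, the toy sketch below infers an "operator" from example pairs and then applies it autoregressively. The operator here is deliberately trivial (a single scalar decay factor fitted by least squares) and stands in for the learned transformer; `fit_toy_operator` and `rollout` are hypothetical names, and nothing here reflects VICON's actual architecture.

```python
import numpy as np

def fit_toy_operator(pairs):
    """Least-squares scalar a such that target ≈ a * condition over all pairs."""
    num = sum(float(np.vdot(cond, target)) for cond, target in pairs)
    den = sum(float(np.vdot(cond, cond)) for cond, _ in pairs)
    return num / den

def rollout(frames, stride, steps):
    """Predict `steps` future frames, each `stride` timesteps after the last."""
    pairs = [(frames[i], frames[i + stride]) for i in range(len(frames) - stride)]
    if not pairs:
        raise ValueError("not enough frames to form a context at this stride")
    a = fit_toy_operator(pairs)       # "operator" inferred from the context
    state = frames[-1]
    predictions = []
    for _ in range(steps):
        state = a * state             # feed the prediction back in
        predictions.append(state)
    return predictions

# Toy trajectory: each frame is half the previous one.
frames = [np.full(4, 0.5 ** k) for k in range(3)]
preds = rollout(frames, stride=1, steps=2)
```

In this setup each rollout step advances `stride` frames, so a larger stride covers the same time horizon with fewer model calls, at the cost of fewer available context pairs.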