BeamVQ: Beam Search with Vector Quantization to Mitigate Data Scarcity in Physical Spatiotemporal Forecasting

📅 2025-02-26
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
To address the small-sample generalization bottleneck in physical spatiotemporal forecasting caused by the scarcity of extreme-event data, this paper proposes BeamVQ, a probabilistic self-training framework that combines vector quantization (VQ) with beam search in continuous state spaces. Methodologically, it extends beam search from discrete spaces to continuous physical state spaces. VQ encodes the deterministic outputs of any base forecasting model into a latent space and retrieves multiple entries from a learned codebook to generate diverse candidate predictions, while domain-specific metrics (e.g., the Critical Success Index for extreme events) guide top-k candidate selection and self-ensembling. The resulting ensembles are fed back into the training set, forming a closed-loop iterative refinement process that needs no additional labels. Key contributions include: (1) an extreme-event-oriented, metric-driven self-ensembling strategy; (2) an iterative self-training paradigm requiring no additional labels; and (3) active exploration of rare phenomena beyond the original dataset. Evaluated across multiple benchmarks and backbone architectures, the method reduces forecasting MSE by up to 39% and improves extreme-event detection and robustness.

📝 Abstract
In practice, physical spatiotemporal forecasting can suffer from data scarcity, because collecting large-scale data is non-trivial, especially for extreme events. Hence, we propose BeamVQ, a novel probabilistic framework that realizes iterative self-training with new self-ensemble strategies, achieving better physical consistency and generalization on extreme events. Following any base forecasting model, we can encode its deterministic outputs into a latent space and retrieve multiple codebook entries to generate probabilistic outputs. BeamVQ then extends beam search from discrete spaces to the continuous state spaces of this field. We can further employ domain-specific metrics (e.g., the Critical Success Index for extreme events) to select the top-k candidates and develop a new self-ensemble strategy by combining these high-quality candidates. The self-ensemble not only improves inference quality and robustness but also iteratively augments the training dataset during continuous self-training. Consequently, BeamVQ can explore rare but critical phenomena beyond the original dataset. Comprehensive experiments on different benchmarks and backbones show that BeamVQ consistently reduces forecasting MSE (by up to 39%), enhances extreme-event detection, and proves effective in handling data scarcity.
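The candidate-generation and selection steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`csi`, `nearest_codes`, `select_and_ensemble`), the squared-Euclidean codebook lookup, the plain mean used as the ensemble, and the `reference` field that candidates are scored against are all assumptions made for illustration. The abstract does not specify these details.

```python
import numpy as np

def csi(pred, target, threshold):
    """Critical Success Index: hits / (hits + misses + false alarms)."""
    p = pred >= threshold          # predicted event mask
    t = target >= threshold        # reference event mask
    hits = np.sum(p & t)
    misses = np.sum(~p & t)
    false_alarms = np.sum(p & ~t)
    denom = hits + misses + false_alarms
    return float(hits / denom) if denom > 0 else 0.0

def nearest_codes(z, codebook, n):
    """Indices of the n codebook rows closest to latent vector z
    (squared Euclidean distance; an assumed retrieval rule)."""
    dist = np.sum((codebook - z) ** 2, axis=1)
    return np.argsort(dist, kind="stable")[:n]

def select_and_ensemble(candidates, reference, threshold, k):
    """Score candidates with CSI, keep the top k, and average them.
    Averaging is an assumed, simplest-possible ensemble."""
    candidates = np.asarray(candidates)
    scores = np.array([csi(c, reference, threshold) for c in candidates])
    top = np.argsort(-scores, kind="stable")[:k]
    return candidates[top].mean(axis=0), top
```

In a full pipeline, each index returned by `nearest_codes` would presumably be decoded back into a forecast field before scoring, and the ensemble returned by `select_and_ensemble` would be added to the training set for the next self-training round.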
Problem

Research questions and friction points this paper is trying to address.

Mitigates data scarcity in forecasting
Enhances extreme events detection
Improves physical consistency and generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Beam search in continuous spaces
Self-ensemble strategy enhancement
Iterative self-training with augmentation
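As a rough illustration of what beam search over continuous states could look like, the sketch below keeps the best few trajectories at each forecasting step. The callables `propose` and `score` are hypothetical placeholders: in BeamVQ the candidates would come from codebook retrieval and the score from a domain metric such as CSI, but the summary above does not describe the exact procedure, so treat this as a generic sketch rather than the paper's algorithm.

```python
def beam_search(x0, propose, score, steps, beam_width):
    """Generic beam search over continuous states.

    x0         -- initial state
    propose    -- hypothetical callable: state -> list of candidate next states
    score      -- hypothetical callable: state -> float (higher is better)
    Returns a list of (cumulative_score, trajectory), best first.
    """
    beams = [(0.0, [x0])]
    for _ in range(steps):
        expanded = []
        for total, traj in beams:
            # Expand each kept trajectory with every proposed next state.
            for cand in propose(traj[-1]):
                expanded.append((total + score(cand), traj + [cand]))
        # Keep only the beam_width highest-scoring trajectories.
        expanded.sort(key=lambda b: b[0], reverse=True)
        beams = expanded[:beam_width]
    return beams

# Toy usage: scalar states, two proposals per step, reward for being near 3.0.
toy_beams = beam_search(
    0.0,
    propose=lambda x: [x - 1.0, x + 1.0],
    score=lambda x: -abs(x - 3.0),
    steps=3,
    beam_width=2,
)
```

Unlike greedy rollout, which commits to a single candidate per step, keeping `beam_width` trajectories lets a candidate that scores slightly worse now survive if it leads to better states later.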
Authors

Weiyan Wang
Tencent
Machine Learning System, High Performance Computing

Xingjian Shi
OpenAI
Deep Learning, Computer Vision, Natural Language Processing, Multimodal, Speech

Ruiqi Shu
Department of Earth System Science, Ministry of Education Key Laboratory for Earth System Modeling, Institute for Global Change Studies, Tsinghua University

Yuan Gao
Department of Earth System Science, Ministry of Education Key Laboratory for Earth System Modeling, Institute for Global Change Studies, Tsinghua University

Rui Ray Chen
Tsinghua University
Artificial Intelligence

Kun Wang
School of Computer Science and Engineering, Nanyang Technological University

Fan Xu
Department of Computer Science, University of Science and Technology of China

Jinbao Xue
TEG, Tencent

Shuaipeng Li
Tencent

Yangyu Tao
TEG, Tencent

Di Wang
TEG, Tencent

Hao Wu
Department of Earth System Science, Ministry of Education Key Laboratory for Earth System Modeling, Institute for Global Change Studies, Tsinghua University; Department of Computer Science, University of Science and Technology of China

Xiaomeng Huang
Tsinghua University
Earth System Model, HPC, Big Data, AI