BeamVQ: Beam Search with Vector Quantization to Mitigate Data Scarcity in Physical Spatiotemporal Forecasting

📅 2025-02-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the small-sample generalization bottleneck in physical spatiotemporal forecasting caused by the scarcity of extreme-event data, this paper proposes BeamVQ, a probabilistic self-training framework that combines vector quantization (VQ) with beam search in continuous state spaces. Methodologically, it extends beam search from discrete spaces to continuous physical state spaces. VQ encodes the base model's deterministic outputs into a latent space and retrieves multiple entries from a learned codebook to generate diverse candidate predictions, while domain-specific metrics (e.g., Critical Success Index) guide top-k candidate selection and self-ensembling. The selected candidates are fed back into training, forming a closed-loop iterative refinement process that needs no additional annotation. Key contributions include: (1) a metric-driven self-ensemble strategy oriented toward extreme events; (2) an iterative self-training paradigm that augments the training set without additional labels; and (3) the ability to explore rare phenomena beyond the original dataset. Evaluated across multiple benchmarks and backbone architectures, the method reduces forecasting MSE by up to 39% and improves extreme-event detection and model robustness.
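
For readers unfamiliar with the Critical Success Index (CSI) mentioned above, its standard definition is given here. The event threshold $\tau$ and the symbols $\hat{y}_i$ (forecast) and $y_i$ (observation) at grid point $i$ are notation assumed for this note, not taken from the paper.

$$
\mathrm{CSI}(\tau) = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN} + \mathrm{FP}},
\qquad
\begin{aligned}
\mathrm{TP} &= \#\{\, i : \hat{y}_i \ge \tau,\ y_i \ge \tau \,\} \\
\mathrm{FN} &= \#\{\, i : \hat{y}_i < \tau,\ y_i \ge \tau \,\} \\
\mathrm{FP} &= \#\{\, i : \hat{y}_i \ge \tau,\ y_i < \tau \,\}
\end{aligned}
$$

CSI ranges from 0 to 1 and ignores correct negatives, which is why it is commonly used for rare events: a forecast that never predicts the event scores 0 no matter how many non-event points it gets right.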

📝 Abstract

In practice, physical spatiotemporal forecasting can suffer from data scarcity, because collecting large-scale data is non-trivial, especially for extreme events. Hence, we propose BeamVQ, a novel probabilistic framework that realizes iterative self-training with new self-ensemble strategies, achieving better physical consistency and generalization on extreme events. Following any base forecasting model, we can encode its deterministic outputs into a latent space and retrieve multiple codebook entries to generate probabilistic outputs. BeamVQ then extends beam search from discrete spaces to the continuous state spaces of this field. We can further employ domain-specific metrics (e.g., Critical Success Index for extreme events) to select the top-k candidates, and we develop a new self-ensemble strategy that combines these high-quality candidates. The self-ensemble not only improves inference quality and robustness but also iteratively augments the training dataset during continued self-training. Consequently, BeamVQ enables the exploration of rare but critical phenomena beyond the original dataset. Comprehensive experiments on different benchmarks and backbones show that BeamVQ consistently reduces forecasting MSE (by up to 39%) and improves extreme-event detection, demonstrating its effectiveness in handling data scarcity.
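
The candidate-generation and selection steps described in the abstract can be illustrated with a minimal NumPy sketch of a single forecasting step. This is not the authors' implementation. The function names (`csi`, `nearest_codes`, `select_and_ensemble`), the Euclidean nearest-neighbour retrieval, the plain averaging used for the ensemble, the `decode` callable, and the `reference` field used for scoring are all assumptions made for illustration.

```python
import numpy as np


def csi(pred, target, threshold):
    """Critical Success Index: hits / (hits + misses + false alarms)."""
    p, t = pred >= threshold, target >= threshold
    hits = np.sum(p & t)
    misses = np.sum(~p & t)
    false_alarms = np.sum(p & ~t)
    denom = hits + misses + false_alarms
    # Convention assumed here: return 0.0 when no event is forecast or observed.
    return float(hits / denom) if denom > 0 else 0.0


def nearest_codes(z, codebook, n):
    """Indices of the n codebook entries closest (Euclidean) to latent z."""
    dists = np.linalg.norm(codebook - z, axis=1)
    return np.argsort(dists, kind="stable")[:n]


def select_and_ensemble(z, codebook, decode, reference, n=4, k=2, threshold=0.5):
    """Generate n candidates from the codebook, keep the top-k by CSI, average them.

    decode    -- hypothetical callable mapping a codebook vector to a forecast field
    reference -- field the candidates are scored against (an assumption: e.g.
                 ground truth during training)
    """
    idx = nearest_codes(z, codebook, n)
    candidates = [decode(codebook[i]) for i in idx]
    scores = np.array([csi(c, reference, threshold) for c in candidates])
    # Stable sort on negated scores: highest CSI first, ties keep retrieval order.
    top = np.argsort(-scores, kind="stable")[:k]
    ensemble = np.mean([candidates[i] for i in top], axis=0)
    return ensemble, scores[top]
```

In a multi-step rollout, a beam search would keep the top-k candidates at each step and expand each of them again at the next step; the sketch covers only one such expansion. In a self-training loop, the returned `ensemble` could be added to the training set as a pseudo-label, which is the role the abstract assigns to the self-ensemble.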

Problem

Research questions and friction points this paper is trying to address.

Mitigates data scarcity in forecasting

Enhances extreme events detection

Improves physical consistency and generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Beam search in continuous spaces

Self-ensemble strategy enhancement

Iterative self-training with augmentation

🔎 Similar Papers

Does Vector Quantization Fail in Spatio-Temporal Forecasting? Exploring a Differentiable Sparse Soft-Vector Quantization Approach