Fine Flood Forecasts: Incorporating local data into global models through fine-tuning

📅 2025-04-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Global machine learning (ML) hydrological models suffer from poor regional generalizability, difficulty in local adaptation, and high operational deployment barriers. Method: This paper proposes a two-stage paradigm—“global pretraining + watershed-level fine-tuning”—leveraging multi-source, multi-scale hydrological time-series data to build a transferable foundation model, enabling national forecasting agencies to achieve lightweight adaptation using only limited local observations. Contribution/Results: We provide the first systematic empirical validation that localized observational data critically enhances regional performance of global ML hydrological models—particularly improving flood prediction accuracy in underfitting watersheds (average Nash–Sutcliffe Efficiency gain: 0.15–0.32). The approach significantly lowers technical and data requirements for model localization, offering hydro-meteorological agencies a plug-and-play, reusable pathway for customized forecast system upgrades.

Technology Category

Application Category

📝 Abstract
Floods are the most common form of natural disaster and accurate flood forecasting is essential for early warning systems. Previous work has shown that machine learning (ML) models are a promising way to improve flood predictions when trained on large, geographically-diverse datasets. This requirement of global training can result in a loss of ownership for national forecasters who cannot easily adapt the models to improve performance in their region, preventing ML models from being operationally deployed. Furthermore, traditional hydrology research with physics-based models suggests that local data -- which in many cases is only accessible to local agencies -- is valuable for improving model performance. To address these concerns, we demonstrate a methodology of pre-training a model on a large, global dataset and then fine-tuning that model on data from individual basins. This results in performance increases, validating our hypothesis that there is extra information to be captured in local data. In particular, we show that performance increases are most significant in watersheds that underperform during global training. We provide a roadmap for national forecasters who wish to take ownership of global models using their own data, aiming to lower the barrier to operational deployment of ML-based hydrological forecast systems.
Problem

Research questions and friction points this paper is trying to address.

Improving flood forecasts using local data in global models
Enabling national forecasters to adapt global ML models locally
Enhancing model performance in underperforming watersheds via fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuning global models with local basin data
Pre-training on large global datasets first
Boosting performance in underperforming watersheds
🔎 Similar Papers
No similar papers found.