Does Your Wildfire Prediction Model Actually Work, or Just Score Well?

Date: 2026-05-14

AI Summary
Existing Earth foundation models are pretrained for general atmospheric and geophysical objectives rather than wildfire prediction. Because wildfire events are sparse in space and time, conclusions about how well these models transfer are sensitive to matching rules and evaluation settings. This work proposes WILDFIRE-FM, described by the authors as the first foundation model pretrained specifically for wildfire forecasting, using weather, active-fire observations, topography, vegetation, and static environmental data. To make comparisons controlled, the authors introduce a "fixed-contract" evaluation framework with two checks: a fixed-output check for matching-rule effects and a fixed-feature check for head-selection effects. Under matched contracts, WILDFIRE-FM is compared with ten Earth-FM baselines across four task types: occupancy, spread, retrieval, and regression. The results indicate that transfer conclusions in wildfire prediction depend strongly on evaluation design and task formulation.
Abstract
Wildfire prediction is important for early warning and resource allocation, yet existing Earth foundation models (Earth FMs) are pretrained for general atmospheric and geophysical objectives rather than wildfire forecasting. To address this gap, we introduce WILDFIRE-FM, the first foundation model pretrained specifically for wildfire prediction using weather, active-fire observations, topography, vegetation, and static environmental data. However, introducing a domain-specific backbone alone does not solve the evaluation problem: wildfire events are sparse in space and time, making transfer conclusions highly sensitive to matching rules and evaluation settings. To address this problem, we introduce a fixed-contract evaluation framework with two controlled checks: a fixed-output check for matching-rule effects and a fixed-feature check for head-selection effects. Under matched contracts, we compare WILDFIRE-FM with ten Earth-FM baselines across occupancy, spread, retrieval, and regression tasks. Our results show that wildfire transfer conclusions depend strongly on evaluation design and task formulation. We hope this framework and WILDFIRE-FM provide a foundation for future wildfire-specific Earth-FM research and benchmarking. Our code is available at https://anonymous.4open.science/r/Wildfire-fm-evaluation-contracts-5AE9/.
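The sensitivity to matching rules mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' framework or code: the function `match_scores`, the Chebyshev-distance tolerance, and the grid cells for `observed`, `model_a`, and `model_b` are all hypothetical, chosen only to show how the same fixed outputs can be scored differently when the matching rule changes.

```python
def match_scores(predicted, observed, tolerance):
    """Return (precision, recall, F1) for sets of (row, col) grid cells.

    Assumed matching rule (hypothetical): a cell counts as matched if a cell
    from the other set lies within `tolerance` cells in Chebyshev distance.
    """
    def near(cell, others):
        r, c = cell
        return any(max(abs(r - r2), abs(c - c2)) <= tolerance for r2, c2 in others)

    matched_pred = sum(near(p, observed) for p in predicted)  # predictions near a fire
    matched_obs = sum(near(o, predicted) for o in observed)   # fires near a prediction
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_obs / len(observed) if observed else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical sparse scene: only two fire cells on a large grid.
observed = {(10, 10), (40, 7)}
model_a = {(10, 10), (25, 25)}  # one exact hit, one far miss
model_b = {(11, 11), (41, 8)}   # two near misses, each one cell off

# The model outputs stay fixed; only the matching tolerance changes.
for tol in (0, 1):
    print("tolerance", tol,
          "A:", match_scores(model_a, observed, tol),
          "B:", match_scores(model_b, observed, tol))
```

With exact matching (tolerance 0), the hypothetical model A outscores model B; allowing a one-cell tolerance reverses the ranking, even though neither model's output changed. With so few positive cells, a single matched or unmatched event moves the score substantially, which is the kind of effect a check that holds outputs fixed and varies only the matching rule is meant to expose.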
Problem

Research questions and friction points this paper is trying to address.

wildfire prediction
Earth foundation models
evaluation framework
transfer learning
sparse events
Innovation

Methods, ideas, or system contributions that make the work stand out.

wildfire prediction
foundation model
evaluation framework
fixed-contract evaluation
Earth foundation models