AI Summary
This study addresses the limited interpretability and high training costs of post-storm property damage nowcasting. The authors propose a training-free, reasoning-reinforced retrieval-augmented generation framework that builds a knowledge base of structured features, natural language summaries, and reasoning trajectories. During inference, the method retrieves reasoning trajectories from geographically nearby samples and class prototypes, guiding a large language model to reuse established reasoning patterns in a two-stage damage prediction. The work introduces reasoning trajectories as knowledge units for flood damage estimation and adds a conditional downgrade step that corrects over-predicted severity. In a Harris County case study, the framework achieves damage class accuracy of 0.757 to 0.896 across seven large language model backbones, comparable to the supervised baseline (0.859), while lightweight variants deliver substantially higher cost efficiency and every prediction comes with a structured, interpretable rationale.
Abstract
R2RAG-Flood is a reasoning-reinforced, training-free retrieval-augmented generation framework for post-storm property damage nowcasting. Building on an existing supervised tabular predictor, the framework constructs a reasoning-centric knowledge base composed of labeled tabular records, where each sample includes structured predictors, a compact natural language text-mode summary, and a model-generated reasoning trajectory. During inference, R2RAG-Flood issues context-augmented prompts that retrieve and condition on relevant reasoning trajectories from nearby geospatial neighbors and canonical class prototypes, enabling the large language model backbone to emulate and adapt prior reasoning rather than learn new task-specific parameters. Predictions follow a two-stage procedure that first determines property damage occurrence and then refines severity within a three-level Property Damage Extent categorization, with a conditional downgrade step to correct over-predicted severity. In a case study of Harris County, Texas at the 12-digit Hydrologic Unit Code scale, the supervised tabular baseline trained directly on structured predictors achieves 0.714 overall accuracy and 0.859 damage class accuracy for medium and high damage classes. Across seven large language model backbones, R2RAG-Flood attains 0.613 to 0.668 overall accuracy and 0.757 to 0.896 damage class accuracy, approaching the supervised baseline while additionally producing a structured rationale for each prediction. Using a severity-per-cost efficiency metric derived from API pricing and GPU instance costs, lightweight R2RAG-Flood variants demonstrate substantially higher efficiency than both the supervised tabular baseline and larger language models, while requiring no task-specific training or fine-tuning.
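The retrieval and two-stage steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Record` fields, the level names in `LEVELS`, the haversine-based neighbor search, the prompt layout, and the one-step downgrade rule are all assumptions made for illustration, and the language model calls are replaced by plain function arguments.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

# Assumed level names; the source only states that the categorization has three levels.
LEVELS = ("low", "medium", "high")


@dataclass
class Record:
    """One hypothetical knowledge-base entry."""
    lat: float
    lon: float
    label: str      # one of LEVELS
    summary: str    # compact natural language summary of the predictors
    reasoning: str  # stored reasoning trajectory


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def retrieve_neighbors(lat, lon, kb, k=2):
    """Return the k knowledge-base records closest to the query location."""
    return sorted(kb, key=lambda r: haversine_km(lat, lon, r.lat, r.lon))[:k]


def build_prompt(query_summary, neighbors, prototypes):
    """Assemble a context-augmented prompt from retrieved reasoning trajectories."""
    lines = ["Target property:", query_summary, "",
             "Reasoning from nearby labeled samples:"]
    lines += [f"- [{r.label}] {r.reasoning}" for r in neighbors]
    lines += ["", "Reasoning from class prototypes:"]
    lines += [f"- [{r.label}] {r.reasoning}" for r in prototypes]
    return "\n".join(lines)


def two_stage_predict(damage_occurs, severity, downgrade):
    """Stage 1 gates on damage occurrence; stage 2 refines severity.

    The arguments stand in for language model outputs. The downgrade rule
    (step down one level when flagged) is an assumed placeholder.
    """
    if not damage_occurs:
        return LEVELS[0]
    idx = LEVELS.index(severity)
    if downgrade and idx > 0:
        idx -= 1
    return LEVELS[idx]


# Toy knowledge base with made-up coordinates and texts.
KB = [
    Record(29.76, -95.37, "high", "near bayou, deep inundation", "Deep water near channel."),
    Record(29.80, -95.40, "medium", "moderate inundation", "Shallow water, older structure."),
    Record(30.50, -96.00, "low", "outside floodplain", "No inundation recorded."),
]
```

A call such as `retrieve_neighbors(29.77, -95.37, KB, k=2)` selects the two closest toy records, whose reasoning texts `build_prompt` then places in the prompt. Keeping the downgrade as a separate flag mirrors the abstract's description of a conditional correction applied after severity is first assigned.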