AI Summary
Urban flood monitoring has long suffered from delayed manual reporting and sparse data, hindering real-time response. This paper proposes UWAssess, the first framework to synergistically integrate vision foundation models (VFMs) and large language models (LLMs) for automated flood assessment. It employs semi-supervised fine-tuning to enhance few-shot visual understanding and leverages chain-of-thought prompting to enable end-to-end generation of structured natural language reports, from water accumulation detection to comprehensive situational analysis. UWAssess breaks the traditional perception-decision decoupling paradigm by jointly reasoning about inundation extent, depth, risk level, and socio-physical impact. Evaluated on multi-source visual benchmarks, it achieves significant improvements in detection accuracy. GPT-assisted evaluation confirms 92.3% accuracy in critical report elements. Deployed in three pilot cities, UWAssess demonstrates scalability and operational efficacy in real-world scenarios.
Abstract
With climate change intensifying, urban waterlogging poses an increasingly severe threat to global public safety and infrastructure. However, existing monitoring approaches rely heavily on manual reporting and fail to provide timely and comprehensive assessments. In this study, we present Urban Waterlogging Assessment (UWAssess), a foundation model-driven framework that automatically identifies waterlogged areas in surveillance images and generates structured assessment reports. To address the scarcity of labeled data, we design a semi-supervised fine-tuning strategy and a chain-of-thought (CoT) prompting strategy to unleash the potential of the foundation model for data-scarce downstream tasks. Evaluations on challenging visual benchmarks demonstrate substantial improvements in perception performance. GPT-based evaluations confirm the ability of UWAssess to generate reliable textual reports that accurately describe waterlogging extent, depth, risk and impact. This dual capability enables a shift of waterlogging monitoring from perception to generation, while the collaborative framework of multiple foundation models lays the groundwork for intelligent and scalable systems, supporting urban management, disaster response and climate resilience.
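To make the perception-to-generation handoff concrete, the following is a minimal sketch of how a segmentation result might be turned into a chain-of-thought prompt covering the four report elements named above (extent, depth, risk, impact). It is an illustration under stated assumptions, not the authors' implementation: the function names `water_fraction` and `build_cot_prompt`, the wording of the reasoning steps, and the toy binary mask are all hypothetical.

```python
from typing import List

# Hypothetical reasoning steps mirroring the four report elements
# named in the abstract; the exact wording is an assumption.
COT_STEPS = [
    "Extent: describe how much of the scene is covered by water.",
    "Depth: estimate water depth from visible reference objects.",
    "Risk: assign a risk level based on extent and depth.",
    "Impact: describe likely effects on traffic, pedestrians and infrastructure.",
]


def water_fraction(mask: List[List[int]]) -> float:
    """Return the fraction of pixels labelled as water (1) in a binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in mask) / total


def build_cot_prompt(mask: List[List[int]], location: str) -> str:
    """Assemble a step-by-step prompt from a (hypothetical) segmentation mask."""
    frac = water_fraction(mask)
    lines = [
        f"Surveillance image from: {location}",
        f"Segmented water coverage: {frac:.1%} of the frame.",
        "Reason step by step, then write a structured report:",
    ]
    # Number the reasoning steps so the model answers them in order.
    lines += [f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    toy_mask = [[0, 1], [1, 1]]  # toy 2x2 mask, 3 of 4 pixels are water
    print(build_cot_prompt(toy_mask, "Camera 12, Main St underpass"))
```

In such a setup the resulting string would be passed, together with the image, to whichever language model produces the report. That call is omitted here because the abstract does not specify the model or its interface.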