🤖 AI Summary
Historical black-and-white aerial imagery suffers from low resolution, absence of color information, and archival degradation, all of which severely degrade the performance of modern object detection models. To address this, we propose a two-stage GAN-based enhancement framework: first, DeOldify is employed for colorization; second, Real-ESRGAN performs super-resolution reconstruction. This work is the first to cascade color restoration and resolution enhancement specifically for historical aerial photographs, effectively bridging the domain gap between legacy analog imagery and contemporary deep learning models. We evaluate the framework on Faster R-CNN, DETReg, and YOLOv11n. Results show that YOLOv11n achieves an mAP of 85.2% on the fully enhanced images, approximately 40 percentage points higher than on the original grayscale imagery and 20 points higher than on imagery enhanced through colorization alone. This demonstrates substantial improvements in robustness and accuracy for building footprint extraction.
📝 Abstract
Accurate rooftop detection from historical aerial imagery is vital for examining long-term urban development and human settlement patterns. However, black-and-white analog photographs pose significant challenges for modern object detection frameworks due to their limited spatial resolution, lack of color information, and archival degradation. To address these limitations, this study introduces a two-stage image enhancement pipeline based on Generative Adversarial Networks (GANs): image colorization using DeOldify, followed by super-resolution enhancement with Real-ESRGAN. The enhanced images were then used to train and evaluate rooftop detection models, including Faster R-CNN, DETReg, and YOLOv11n. Results show that combining colorization with super-resolution substantially improves detection performance, with YOLOv11n achieving a mean Average Precision (mAP) exceeding 85%. This reflects an improvement of approximately 40% over original black-and-white images and 20% over images enhanced through colorization alone. The proposed method effectively bridges the gap between archival imagery and contemporary deep learning techniques, enabling more reliable extraction of building footprints from historical aerial photographs.
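The ordering of the two enhancement stages can be illustrated with a minimal sketch. It shows only the composition described above (colorize first, then upscale) and does not call DeOldify or Real-ESRGAN: `placeholder_colorize` and `placeholder_upscale` are hypothetical stand-ins, images are plain nested lists, and the 4x scale factor is an assumption that the abstract does not state.

```python
from typing import Callable, List, Tuple

GrayImage = List[List[int]]                      # H x W, intensities 0..255
ColorImage = List[List[Tuple[int, int, int]]]    # H x W of (r, g, b)


def placeholder_colorize(gray: GrayImage) -> ColorImage:
    """Hypothetical stand-in for stage 1 (DeOldify in the paper).

    It only replicates the gray value into three channels; a real
    colorization model would predict plausible colors instead.
    """
    return [[(v, v, v) for v in row] for row in gray]


def placeholder_upscale(image: ColorImage, scale: int = 4) -> ColorImage:
    """Hypothetical stand-in for stage 2 (Real-ESRGAN in the paper).

    Nearest-neighbour upscaling; the scale of 4 is an assumption.
    """
    out: ColorImage = []
    for row in image:
        wide = [px for px in row for _ in range(scale)]   # repeat columns
        out.extend(list(wide) for _ in range(scale))      # repeat rows
    return out


def enhance(
    gray: GrayImage,
    colorize: Callable[[GrayImage], ColorImage] = placeholder_colorize,
    upscale: Callable[[ColorImage], ColorImage] = placeholder_upscale,
) -> ColorImage:
    """Apply the two stages in the order the abstract describes."""
    return upscale(colorize(gray))


tile = [[0, 128], [255, 64]]          # a tiny 2 x 2 grayscale tile
enhanced = enhance(tile)
print(len(enhanced), len(enhanced[0]), enhanced[0][0])
```

Passing the stages as callables keeps the order fixed while leaving the models swappable, so a colorization-only or super-resolution-only variant could be compared against the full cascade by substituting an identity function for one stage.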