🤖 AI Summary
Existing facade parsing methods struggle to maintain structural consistency under occlusion and perspective distortion, often yielding geometrically irregular layouts. This work adds a lightweight alignment loss to the YOLOv8 training objective, injecting grid-alignment geometric priors without altering the inference pipeline. By encouraging bounding boxes to follow regular spatial arrangements, the method corrects alignment errors caused by occlusion and perspective effects on the CMP dataset. It offers a controllable trade-off between detection accuracy and geometric regularity, improving the structural plausibility of parsed facades while preserving detection performance.
📝 Abstract
Standard object detectors typically treat architectural elements independently, often producing facade parses that lack the structural coherence required for downstream procedural reconstruction. We address this limitation by augmenting the YOLOv8 training objective with a custom lightweight alignment loss. This regularization encourages grid-consistent arrangements of bounding boxes during training, injecting geometric priors without altering the standard inference pipeline. Experiments on the CMP dataset show that our method improves structural regularity, correcting alignment errors caused by perspective and occlusion while maintaining a controllable trade-off with standard detection accuracy.
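
The abstract does not give the form of the alignment loss, so the following is only a minimal sketch of one way a grid-alignment penalty could be added to a detection loss. Everything in it is an assumption made for illustration: the function names `alignment_penalty` and `total_loss`, the `(cx, cy, w, h)` box layout, the clipping threshold `tau`, and the mixing weight `weight` (standing in for the "controllable trade-off"). It is written in plain Python for readability; an actual training loss would need to operate on framework tensors so that gradients can flow.

```python
def alignment_penalty(boxes, tau=0.05):
    """Mean clipped offset from each box centre to its nearest neighbour.

    boxes: list of (cx, cy, w, h) tuples in normalised image coordinates
           (hypothetical layout chosen for this sketch).
    tau:   offsets larger than tau are clipped, so boxes that are far apart
           are not pulled toward each other (assumed hyperparameter).
    """
    n = len(boxes)
    if n < 2:
        return 0.0
    total = 0.0
    for i, (cx_i, cy_i, _, _) in enumerate(boxes):
        # Smallest horizontal offset to any other box: zero if the box
        # shares a column with at least one neighbour.
        dx = min(abs(cx_i - b[0]) for j, b in enumerate(boxes) if j != i)
        # Smallest vertical offset to any other box: zero if the box
        # shares a row with at least one neighbour.
        dy = min(abs(cy_i - b[1]) for j, b in enumerate(boxes) if j != i)
        total += min(dx, tau) + min(dy, tau)
    return total / n


def total_loss(detection_loss, boxes, weight=0.1):
    """Detection loss plus a weighted alignment term (illustrative only)."""
    return detection_loss + weight * alignment_penalty(boxes)


# A perfectly regular 2x2 grid of windows, then the same grid with one
# window shifted horizontally by 0.02.
grid = [(0.25, 0.25, 0.1, 0.1), (0.75, 0.25, 0.1, 0.1),
        (0.25, 0.75, 0.1, 0.1), (0.75, 0.75, 0.1, 0.1)]
jittered = [(0.27, 0.25, 0.1, 0.1)] + grid[1:]

print(alignment_penalty(grid))      # regular grid incurs no penalty
print(alignment_penalty(jittered))  # the shifted window is penalised
```

In this sketch the penalty only enters the training objective, so inference is unchanged, which mirrors the property described above. Raising `weight` favours regular layouts at the expense of the detection term, and lowering it does the opposite.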