Synergistic Perception and Generative Recomposition: A Multi-Agent Orchestration for Expert-Level Building Inspection

📅 2026-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses facade defect detection, where geometric variability, complex backgrounds, low contrast, compound defects, and scarce pixel-level annotations severely limit model generalization. To overcome these issues, the authors propose FacadeFixer, a multi-agent collaborative framework that formulates defect perception as a cooperative reasoning task. Dedicated detection and segmentation agents handle diverse defect types, while a generative agent decouples complex defects from their backgrounds through semantic recombination and synthesizes them onto varied clean textures, producing high-fidelity augmented data with expert-level masks. Combining multi-agent collaborative reasoning with generative semantic recombination is intended to alleviate annotation scarcity and improve generalization. The authors also introduce a multi-task dataset covering six primary facade categories with pixel-level annotations and report significant performance gains over existing methods, particularly in pixel-level detection of structural anomalies.

📝 Abstract
Building facade defect inspection is fundamental to structural health monitoring and sustainable urban maintenance, yet it remains a formidable challenge due to extreme geometric variability, low contrast against complex backgrounds, and the inherent complexity of composite defects (e.g., cracks co-occurring with spalling). These characteristics lead to severe pixel imbalance and feature ambiguity, which, coupled with the critical scarcity of high-quality pixel-level annotations, hinder the generalization of existing detection and segmentation models. To address these gaps, we propose FacadeFixer, a unified multi-agent framework that treats defect perception as a collaborative reasoning task rather than isolated recognition. Specifically, FacadeFixer orchestrates specialized agents for detection and segmentation to handle multi-type defect interference, working in tandem with a generative agent to enable semantic recomposition. This process decouples intricate defects from noisy backgrounds and realistically synthesizes them onto diverse clean textures, generating high-fidelity augmented data with precise expert-level masks. To support this, we introduce a comprehensive multi-task dataset covering six primary facade categories with pixel-level annotations. Extensive experiments demonstrate that FacadeFixer significantly outperforms state-of-the-art (SOTA) baselines. In particular, it excels at capturing pixel-level structural anomalies, and the results highlight generative synthesis as a robust solution to data scarcity in infrastructure inspection. Our code and dataset will be made publicly available.
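The recomposition idea in the abstract (lifting a defect off its background and placing it on a clean texture together with a matching mask) can be illustrated with a deliberately simple copy-paste sketch. This is not the authors' generative agent, which the abstract does not specify in detail; it is a plain mask-based paste used as a stand-in. The names `recompose` and `defect_fraction`, the grayscale nested-list image format, and the toy values are all assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image, values in [0, 1] (assumed format)
Mask = List[List[int]]     # 1 = defect pixel, 0 = background


def recompose(defect_img: Image, defect_mask: Mask, clean_texture: Image,
              top: int, left: int) -> Tuple[Image, Mask]:
    """Paste only the masked defect pixels onto a clean texture at (top, left).

    Returns the augmented image and a pixel-level mask aligned with it.
    """
    out_img = [row[:] for row in clean_texture]            # copy, do not mutate input
    out_mask = [[0] * len(row) for row in clean_texture]   # start with no defects
    for r, mask_row in enumerate(defect_mask):
        for c, is_defect in enumerate(mask_row):
            rr, cc = top + r, left + c
            in_bounds = 0 <= rr < len(out_img) and 0 <= cc < len(out_img[rr])
            if is_defect and in_bounds:
                out_img[rr][cc] = defect_img[r][c]
                out_mask[rr][cc] = 1
    return out_img, out_mask


def defect_fraction(mask: Mask) -> float:
    """Share of pixels labelled as defect; small values indicate pixel imbalance."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


# Toy example: a 2x2 diagonal "crack" pasted onto a uniform 4x4 wall texture.
crack = [[0.1, 0.9], [0.9, 0.1]]
crack_mask = [[1, 0], [0, 1]]
wall = [[0.5] * 4 for _ in range(4)]
aug_img, aug_mask = recompose(crack, crack_mask, wall, top=1, left=1)
```

Because the mask is written alongside the image, every synthesized sample comes with a pixel-level label at no extra annotation cost, which is the property the abstract relies on to ease annotation scarcity. A generative model would additionally blend lighting and texture; this sketch makes no attempt at that.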
Problem

Research questions and friction points this paper is trying to address.

building facade inspection
composite defects
pixel-level annotation scarcity
feature ambiguity
data imbalance
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent orchestration
generative recomposition
semantic synthesis
pixel-level annotation
defect decoupling
Hui Zhong
The Hong Kong University of Science and Technology (Guangzhou)
data mining, urban science, sustainable transportation
Yichun Gao
The Hong Kong University of Science and Technology (Guangzhou), Systems Hub, China; Guangdong Provincial Key Lab of Integrated Communication, Sensing and Computation for Ubiquitous Internet of Things, Guangdong, China
Luyan Liu
Guangdong Provincial Key Lab of Integrated Communication, Sensing and Computation for Ubiquitous Internet of Things, Guangdong, China; The Hong Kong University of Science and Technology, Hong Kong
Xusen Guo
The Hong Kong University of Science and Technology (Guangzhou), Systems Hub, China; Guangdong Provincial Key Lab of Integrated Communication, Sensing and Computation for Ubiquitous Internet of Things, Guangdong, China
Zhaonian Kuang
The Hong Kong University of Science and Technology (Guangzhou), Systems Hub, China; Guangdong Provincial Key Lab of Integrated Communication, Sensing and Computation for Ubiquitous Internet of Things, Guangdong, China; College of Artificial Intelligence, Xi'an Jiaotong University, Xi'an, China
Qiming Zhang
The Hong Kong University of Science and Technology (Guangzhou), Systems Hub, China
Xinhu Zheng
Assistant Professor, The Hong Kong University of Science and Technology (Guangzhou)