Heterogeneous Model Fusion for Privacy-Aware Multi-Camera Surveillance via Synthetic Domain Adaptation

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses cross-domain object detection in multi-camera surveillance, where privacy constraints, class imbalance, and model heterogeneity hinder effective domain adaptation. The authors propose HeroCrystal, a three-stage framework for privacy-preserving domain adaptation that does not access raw source data. First, a one-shot, target-aware diffusion model learns visual style from a single target-domain image and uses prompt-based control to synthesize rare object instances. Second, clients train a probabilistic Faster R-CNN with a dynamic model contrastive strategy that suppresses domain-specific bias, while the server fuses heterogeneous model architectures without seeing raw data. Finally, an inconsistent-category integration mechanism resolves label-space mismatches and architecture heterogeneity across clients. In the reported experiments on multiple cross-domain detection benchmarks, HeroCrystal reaches 33.4% mAP, 2.1% above prior privacy-preserving methods, and outperforms multi-source domain adaptation and federated learning baselines.
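To make the label-space mismatch concrete, here is a minimal sketch of one way to combine class scores from clients that do not share the same categories. It is not the paper's integration algorithm, which this summary does not specify. The function name `fuse_class_scores` and the averaging rule are assumptions chosen only for illustration: each class is averaged over the clients that actually know it, so a client is not penalized for a category it was never trained on.

```python
def fuse_class_scores(client_scores):
    """Average per-class confidences over the clients that report each class.

    client_scores: list of dicts, one per client, mapping class name to the
    confidence that client assigns to a single matched detection.
    Hypothetical helper for illustration; not the paper's algorithm.
    """
    totals = {}
    counts = {}
    for scores in client_scores:
        for name, score in scores.items():
            totals[name] = totals.get(name, 0.0) + score
            counts[name] = counts.get(name, 0) + 1
    # Only clients whose label space contains the class contribute to its mean.
    return {name: totals[name] / counts[name] for name in totals}


if __name__ == "__main__":
    camera_a = {"car": 0.8, "person": 0.2}    # client with no "bicycle" class
    camera_b = {"car": 0.6, "bicycle": 0.4}   # client with no "person" class
    print(fuse_class_scores([camera_a, camera_b]))
```

The fused result covers the union of both label spaces, which is the property any cross-client ensemble in this setting needs regardless of the exact weighting used.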

📝 Abstract

We propose HeroCrystal, a novel privacy-preserving framework for multi-camera domain-adaptive object detection that addresses data privacy, class imbalance, and heterogeneous architectures. Our framework consists of three key stages. In the Generated Stage, we introduce a one-shot, target-aware diffusion-based generation module that learns visual style from a single target-domain image while leveraging prompt-based control to synthesize specific object instances. Unlike conventional style-transfer-based methods, which require large target datasets and ignore semantic-level discrepancies, our approach enables privacy-preserving augmentation that reduces ethical concerns and introduces controllable rare-object generation to mitigate long-tailed category degradation. In the Federated Stage, we employ probabilistic Faster R-CNN on the client side to improve localization accuracy, together with a dynamic model contrastive strategy to suppress domain-specific bias. The server side performs model fusion across heterogeneous architectures without accessing raw data. Finally, in the Distilled Stage, we propose an inconsistent-categories integration algorithm to resolve label inconsistency and architecture heterogeneity across clients. Extensive experiments on multiple cross-domain detection benchmarks demonstrate that our method outperforms existing multi-source domain adaptation and federated learning baselines under multi-class, privacy-preserving settings. Our method improves mAP by 2.1% over prior privacy-preserving approaches and achieves a new state-of-the-art mAP of 33.4%, highlighting the effectiveness of HeroCrystal in enabling practical multi-camera AI surveillance systems.
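The abstract does not define the dynamic model contrastive strategy. As a hedged illustration, the sketch below shows the standard model-contrastive loss used in federated learning (as popularized by MOON), which the paper's variant may or may not resemble. It pulls the local model's feature toward the global model's feature and pushes it away from the previous local model's feature. The function names, the fixed `temperature`, and the use of plain Python lists as feature vectors are all assumptions; whatever makes the paper's strategy "dynamic" is not modeled here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def model_contrastive_loss(z_local, z_global, z_prev, temperature=0.5):
    """-log( exp(pos) / (exp(pos) + exp(neg)) ), computed in a stable form.

    pos: similarity between the current local feature and the global feature.
    neg: similarity between the current local feature and the feature from
         the previous round's local model.
    Illustrative only; the temperature value is an assumed default.
    """
    pos = cosine(z_local, z_global) / temperature
    neg = cosine(z_local, z_prev) / temperature
    peak = max(pos, neg)
    log_sum = peak + math.log(math.exp(pos - peak) + math.exp(neg - peak))
    return log_sum - pos


if __name__ == "__main__":
    aligned = model_contrastive_loss([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0])
    drifted = model_contrastive_loss([1.0, 0.0], [-1.0, 0.0], [1.0, 0.0])
    print(aligned < drifted)
```

The loss is small when the local representation agrees with the global model and large when it stays close to the client's own previous model, which is one common way to discourage domain-specific drift on each client.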

Problem

Research questions and friction points this paper is trying to address.

privacy-aware surveillance

multi-camera object detection

domain adaptation

heterogeneous model fusion

class imbalance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic Domain Adaptation

Privacy-Preserving Surveillance

Heterogeneous Model Fusion