🤖 AI Summary
Deepfake detection and localization methods often coordinate local details and global semantics poorly, and fusing multi-scale predictions makes them susceptible to noise. To address both problems, this paper proposes a multi-scale morphological optimization fusion framework. It introduces a dual-branch architecture comprising local feature extraction and mesoscopic semantic analysis, and applies morphological opening and closing operations to enforce spatial consistency across scales and suppress prediction noise. This design mitigates the error amplification inherent in conventional fusion strategies, improving localization accuracy and robustness. Extensive experiments on multiple benchmark datasets demonstrate that the proposed method consistently outperforms state-of-the-art approaches in both localization accuracy and model stability, validating the effectiveness and generalizability of morphologically guided multi-scale fusion.
📝 Abstract
While higher accuracy in deepfake detection remains a central goal, there is an increasing demand for precise localization of manipulated regions. Despite the remarkable progress made in classification-based detection, accurately localizing forged areas remains a significant challenge. A common strategy is to train the model on forged-region annotations alongside the manipulated images. However, such approaches often neglect the complementary nature of local detail and global semantic context, resulting in suboptimal localization performance. A further, often-overlooked aspect is the fusion strategy between local and global predictions: naively combining the outputs of both branches can amplify noise and errors, undermining the effectiveness of the localization.
To address these issues, we propose a novel approach that predicts manipulated regions independently from a local and a global perspective. We then fuse the two outputs with morphological operations, suppressing noise while enhancing spatial coherence. Extensive experiments demonstrate the effectiveness of each module in improving the accuracy and robustness of forgery localization.
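As an illustration of how morphological operations can clean up a fused prediction, the following is a minimal sketch and not the paper's implementation. It assumes two per-pixel probability maps of equal size, averages them, thresholds the result, and then applies an opening (to remove isolated false positives) followed by a closing (to fill small holes). The averaging rule, the 0.5 threshold, the 3x3 structuring element, and the function names `fuse_predictions`, `dilate`, and `erode` are all assumptions made for this example.

```python
import numpy as np


def _neighborhood(mask, pad_value):
    """Return the nine 3x3-shifted views of a boolean mask (padded at the border)."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]


def dilate(mask):
    """Binary dilation with a 3x3 square: a pixel is set if any neighbor is set."""
    return np.logical_or.reduce(_neighborhood(mask, False))


def erode(mask):
    """Binary erosion with a 3x3 square: a pixel survives only if all neighbors are set.

    Pixels outside the image are treated as set, so the image border
    does not erode regions by itself (an assumption of this sketch).
    """
    return np.logical_and.reduce(_neighborhood(mask, True))


def fuse_predictions(local_prob, global_prob, threshold=0.5):
    """Fuse two probability maps into one binary mask (hypothetical fusion rule).

    1. Average the two maps and threshold them (assumed combination rule).
    2. Opening (erode, then dilate) removes isolated noisy pixels.
    3. Closing (dilate, then erode) fills small holes inside regions.
    """
    fused = (np.asarray(local_prob) + np.asarray(global_prob)) / 2.0 >= threshold
    opened = dilate(erode(fused))
    closed = erode(dilate(opened))
    return closed


if __name__ == "__main__":
    # Toy example: a 7x7 forged region with a one-pixel hole, plus one noisy pixel.
    local_prob = np.zeros((11, 11))
    global_prob = np.zeros((11, 11))
    local_prob[2:9, 2:9] = 0.9
    global_prob[2:9, 2:9] = 0.9
    local_prob[5, 5] = global_prob[5, 5] = 0.1   # hole inside the region
    local_prob[0, 0] = 1.0                       # isolated false positive
    global_prob[0, 0] = 0.2
    print(fuse_predictions(local_prob, global_prob).astype(int))
```

In this toy input, the opening discards the isolated pixel at the corner and the closing fills the hole in the middle of the region. A larger structuring element would remove larger noise blobs, at the cost of also erasing small genuine forged regions.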