Semantically Stable Image Composition Analysis via Saliency and Gradient Vector Flow Fusion

📅 2026-04-14
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the challenge of assessing image composition with features that capture spatial layout while remaining robust to semantic content, and proposes VFCNet. The method fuses saliency and edge information into a Gradient Vector Flow (GVF) field, using a dual-stream GVF representation, attention-based fusion, and multi-scale features extracted with a self-supervised DINOv3 backbone to build a low-level composition representation. VFCNet achieves state-of-the-art performance on the PICD benchmark, with CDA-1 and CDA-2 scores of 0.683 and 0.629, improvements of 33.1% and 36.1% over the previous best method. The authors also report that a simple classifier on DINOv3 features alone substantially outperforms more sophisticated, composition-specialized models.

📝 Abstract
The reliable computational assessment of photographic composition requires features that are discriminative of spatial layout yet robust to semantic content. This paper proposes a low-level representation grounded in the assumption that composition can be understood as the flow of visual attention across geometric structure. We introduce VFCNet, which fuses saliency and edge information into a gradient vector flow (GVF) field. The model computes dual-stream GVF representations, integrates them via attention, and extracts multi-scale flow features with a DINOv3 backbone. VFCNet achieves state-of-the-art performance on the PICD benchmark (CDA-1: 0.683, CDA-2: 0.629), improving by 33.1% and 36.1% over the previous best method. We also show that a simple classifier on self-supervised DINOv3 features substantially outperforms more sophisticated, composition-specialized models. Code is available at https://github.com/ADadras/VFCNet
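For readers unfamiliar with gradient vector flow, the sketch below illustrates the classic formulation (Xu and Prince), in which the gradient of a scalar map is diffused into flat regions while staying close to the original gradient near strong edges. It is not taken from the VFCNet code. The function name `gradient_vector_flow`, the parameters `mu`, `iterations`, and `dt`, and the use of a single generic input map `f` are all assumptions made for illustration; how the paper combines saliency and edge maps before or after this step is not specified in the abstract.

```python
import numpy as np


def laplacian(a):
    """5-point discrete Laplacian with replicated borders."""
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * a


def gradient_vector_flow(f, mu=0.2, iterations=200, dt=0.5):
    """Illustrative GVF of a 2-D scalar map f (e.g. an edge or saliency map).

    Hypothetical helper, not the paper's implementation. Iterates
        u <- u + dt * (mu * lap(u) - |grad f|^2 * (u - f_x))
        v <- v + dt * (mu * lap(v) - |grad f|^2 * (v - f_y))
    The explicit scheme assumes unit grid spacing and dt * mu <= 0.25.
    """
    f = np.asarray(f, dtype=np.float64)
    span = f.max() - f.min()
    if span > 0:
        f = (f - f.min()) / span  # normalize to [0, 1]

    fy, fx = np.gradient(f)       # derivatives along rows (y) and columns (x)
    mag2 = fx ** 2 + fy ** 2      # squared gradient magnitude

    u, v = fx.copy(), fy.copy()   # initialize the field with the raw gradient
    for _ in range(iterations):
        u = u + dt * (mu * laplacian(u) - mag2 * (u - fx))
        v = v + dt * (mu * laplacian(v) - mag2 * (v - fy))
    return u, v


if __name__ == "__main__":
    # Toy map: a bright square on a dark background.
    f = np.zeros((32, 32))
    f[12:20, 12:20] = 1.0
    u, v = gradient_vector_flow(f)
    print(u.shape, v.shape)
```

In the toy example the raw gradient is zero away from the square's border, whereas the diffused field is nonzero there and points toward the square, which is the property that makes GVF useful as a dense "flow" description of structure.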
Problem

Research questions and friction points this paper is trying to address.

image composition
semantic robustness
saliency
gradient vector flow
computational aesthetics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient Vector Flow
Saliency Fusion
Visual Composition
DINOv3
Self-supervised Representation
🔎 Similar Papers
2024-02-12, International Conference on Information Photonics, Citations: 1