DSGC-Net: A Dual-Stream Graph Convolutional Network for Crowd Counting via Feature Correlation Mining

📅 2025-09-02
🤖 AI Summary
In complex crowd scenes, uneven density distributions and variations in viewpoint and pose lead to inconsistent individual representations. To address this, we propose DSGC-Net, a Dual-Stream Graph Convolutional Network that constructs two complementary semantic graphs: a density-driven graph that links regions of similar density, and a representation-driven graph that uses global feature similarity to relate individuals whose appearance differs with viewpoint and pose. The method combines a density prediction module with graph convolutional modeling of semantic relations, which is intended to improve adaptability to density variations and to multi-view, multi-pose scenes. On ShanghaiTech Part A and Part B, our approach achieves mean absolute errors (MAE) of 48.9 and 5.9, respectively, which the paper reports as better than current state-of-the-art methods. These results support the effectiveness of dual semantic graph modeling for crowd counting.
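For reference, MAE in crowd counting is the average absolute difference between the predicted and ground-truth count per image, where the predicted count is usually the sum of the predicted density map. The following is a minimal sketch in plain Python; the function names and sample numbers are illustrative assumptions and are not taken from the paper or its released code.

```python
def count_from_density_map(density_map):
    """Estimate the crowd count by summing every cell of a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted_counts, true_counts):
    """Average of |predicted - true| over all test images."""
    if not predicted_counts or len(predicted_counts) != len(true_counts):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(true_counts)


# Made-up numbers, for illustration only.
toy_density_map = [[0.5, 0.5], [1.0, 0.0]]
toy_count = count_from_density_map(toy_density_map)

predicted = [102.0, 48.0, 250.0]
ground_truth = [100.0, 50.0, 245.0]
mae = mean_absolute_error(predicted, ground_truth)
```

A lower MAE means the predicted counts are, on average, closer to the annotated counts.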

📝 Abstract
Deep learning-based crowd counting methods have achieved remarkable progress in recent years. However, in complex crowd scenarios, existing models still struggle to adapt to large differences in density distribution between regions. Additionally, the inconsistency of individual representations caused by viewpoint changes and differences in body posture further limits counting accuracy. To address these challenges, we propose DSGC-Net, a Dual-Stream Graph Convolutional Network based on feature correlation mining. DSGC-Net introduces a Density Approximation (DA) branch and a Representation Approximation (RA) branch. By modeling two semantic graphs, it captures the potential feature correlations in density variations and representation distributions. The DA branch incorporates a density prediction module that generates the density distribution map, and constructs a density-driven semantic graph based on density similarity. The RA branch establishes a representation-driven semantic graph by computing global representation similarity. Graph convolutional networks are then applied to the two semantic graphs separately to model the latent semantic relationships, which enhances the model's ability to adapt to density variations and improves counting accuracy in multi-view and multi-pose scenarios. Extensive experiments on three widely used datasets demonstrate that DSGC-Net outperforms current state-of-the-art methods. In particular, we achieve MAEs of 48.9 and 5.9 on the ShanghaiTech Part A and Part B datasets, respectively. The code is available at: https://github.com/Wu-eon/CrowdCounting-DSGCNet.
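The abstract does not spell out how the two semantic graphs are built. As a rough illustration only, the sketch below assumes a k-nearest-neighbor rule: the density-driven graph links nodes whose predicted density values are close, and the representation-driven graph links nodes whose feature vectors have high cosine similarity. The helpers `knn_graph`, `density_graph`, and `representation_graph`, the use of cosine similarity, and the value of `k` are all assumptions made for this example, not the authors' implementation.

```python
import math


def knn_graph(num_nodes, similarity, k):
    """Symmetric 0/1 adjacency matrix linking each node to its k most
    similar neighbors (a hypothetical rule, assumed for illustration)."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        others = [j for j in range(num_nodes) if j != i]
        others.sort(key=lambda j: similarity(i, j), reverse=True)
        for j in others[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # keep the graph undirected
    return adj


def density_graph(densities, k):
    """Nodes with close density values count as similar."""
    return knn_graph(
        len(densities),
        lambda i, j: -abs(densities[i] - densities[j]),
        k,
    )


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def representation_graph(features, k):
    """Nodes with similar feature vectors count as similar."""
    return knn_graph(
        len(features),
        lambda i, j: cosine(features[i], features[j]),
        k,
    )


# Toy example with four nodes (made-up values).
densities = [0.10, 0.12, 0.90, 0.95]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]
density_adj = density_graph(densities, k=1)
repr_adj = representation_graph(features, k=1)
```

In this toy case the density graph pairs nodes 0-1 and 2-3, while the representation graph pairs nodes 0-2 and 1-3, which illustrates how the two graphs can encode complementary relations over the same nodes.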
Problem

Research questions and friction points this paper is trying to address.

Addressing density distribution differences in crowd counting
Improving individual representation consistency across viewpoints
Enhancing counting accuracy in multi-view and multi-pose scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-stream network with density and representation branches
Graph convolutional networks model feature correlations
Semantic graphs capture density and representation similarities
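To make the graph-convolution step concrete, here is a minimal sketch of one propagation step using the widely used normalized-adjacency rule, ReLU(D^(-1/2) (A + I) D^(-1/2) H W), applied to each semantic graph separately. Whether DSGC-Net uses exactly this rule, and how it combines the two streams afterward, is not stated on this page; the function names, toy graphs, and identity weights below are assumptions.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops, then apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weight):
    """One graph-convolution step: aggregate neighbors, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weight)
    return [[max(0.0, value) for value in row] for row in projected]


# Toy graphs over three nodes (made-up values).
density_adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
repr_adj = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Each stream runs its own graph convolution over the same node features.
density_out = gcn_layer(density_adj, identity, identity)
repr_out = gcn_layer(repr_adj, identity, identity)
```

With identity features and weights, each output row shows how a node mixes its own feature with those of its graph neighbors, so the two streams produce different mixtures from the same input.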
Authors

Yihong Wu
Taiyuan University of Technology, Taiyuan 030024, China
Jinqiao Wei
Taiyuan University of Technology, Taiyuan 030024, China
Xionghui Zhao
Taiyuan University of Technology, Taiyuan 030024, China
Yidi Li
Taiyuan University of Technology, Taiyuan 030024, China
Shaoyi Du
Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University
Pattern Recognition, Computer Vision, Image Processing
Bin Ren
University of Trento, Trento, Italy; University of Pisa, Pisa, Italy
Nicu Sebe
University of Trento
computer vision, multimedia