Semi-Supervised Multi-View Crowd Counting by Ranking Multi-View Fusion Models

📅 2025-12-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Multi-view crowd counting suffers from a scarcity of annotated multi-view data, with existing datasets limited in both scene diversity and frame count. Method: This paper proposes a semi-supervised learning framework leveraging model prediction consistency and uncertainty-aware ranking. It introduces a novel multi-view fusion model and, for the first time, encodes monotonicity of count predictions across views as a semi-supervised regularization prior. The framework integrates Monte Carlo Dropout for uncertainty estimation and enforces dual ranking constraints (prediction ranking and uncertainty ranking) to enhance robustness and generalization. Contribution/Results: Evaluated under limited labeling budgets, the method reduces mean absolute error by 12.7% over state-of-the-art semi-supervised approaches. Gains are especially pronounced in heavily occluded scenes, significantly alleviating reliance on densely annotated multi-view data.
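The summary mentions Monte Carlo Dropout as the uncertainty estimator. The sketch below illustrates the general technique only: dropout stays active at inference, the model is run several times, and the spread of the outputs serves as the uncertainty. The toy linear "counter", the function name `mc_dropout_count`, and all parameter values are assumptions made for illustration, not the paper's model.

```python
import random
from statistics import mean, pvariance
from typing import Sequence, Tuple


def mc_dropout_count(
    features: Sequence[float],
    weights: Sequence[float],
    drop_prob: float = 0.5,
    passes: int = 50,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (mean count, variance) over several stochastic forward passes.

    Hypothetical toy model: the count is a weighted sum of features, and
    each feature is dropped independently with probability drop_prob.
    """
    rng = random.Random(seed)
    keep = 1.0 - drop_prob
    samples = []
    for _ in range(passes):
        total = 0.0
        for x, w in zip(features, weights):
            if rng.random() < keep:
                # Inverted dropout: rescale kept units so the expected sum is unchanged.
                total += x * w / keep
        samples.append(total)
    # The mean is the prediction; the variance across passes is the uncertainty.
    return mean(samples), pvariance(samples)
```

With `drop_prob=0.0` every pass is identical, so the variance is zero; larger dropout rates spread the samples and raise the estimated uncertainty.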

📝 Abstract
Multi-view crowd counting has been proposed to deal with the severe occlusion issue of crowd counting in large and wide scenes. However, due to the difficulty of collecting and annotating multi-view images, the datasets for multi-view counting have a limited number of multi-view frames and scenes. To solve the problem of limited data, one approach is to collect synthetic data to bypass the annotating step, while another is to propose semi- or weakly-supervised or unsupervised methods that demand less multi-view data. In this paper, we propose two semi-supervised multi-view crowd counting frameworks by ranking the multi-view fusion models of different numbers of input views, in terms of the model predictions or the model uncertainties. Specifically, for the first method (vanilla model), we rank the multi-view fusion models' prediction results of different numbers of camera-view inputs, namely, the model's predictions with fewer camera views shall not be larger than the predictions with more camera views. For the second method, we rank the estimated model uncertainties of the multi-view fusion models with a variable number of view inputs, guided by the multi-view fusion models' prediction errors, namely, the model uncertainties with more camera views shall not be larger than those with fewer camera views. These constraints are introduced into the model training in a semi-supervised fashion for multi-view counting with limited labeled data. The experiments demonstrate the advantages of the proposed multi-view model ranking methods compared with other semi-supervised counting methods.
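The two ranking constraints in the abstract can be sketched as hinge penalties that need no count labels. This is a minimal illustration, not the paper's loss: the hinge form, the comparison of adjacent view counts only, the `margin` parameter, and the function names are all assumptions. Each list is indexed so that entry `i` comes from the fusion model run with `i + 1` camera views.

```python
from typing import Sequence


def prediction_ranking_loss(counts_by_views: Sequence[float], margin: float = 0.0) -> float:
    """Penalize a predicted count from fewer views exceeding one from more views."""
    loss = 0.0
    for fewer, more in zip(counts_by_views, counts_by_views[1:]):
        # Constraint: count(fewer views) <= count(more views).
        loss += max(0.0, fewer - more + margin)
    return loss


def uncertainty_ranking_loss(unc_by_views: Sequence[float], margin: float = 0.0) -> float:
    """Penalize an uncertainty from more views exceeding one from fewer views."""
    loss = 0.0
    for fewer, more in zip(unc_by_views, unc_by_views[1:]):
        # Constraint: uncertainty(more views) <= uncertainty(fewer views).
        loss += max(0.0, more - fewer + margin)
    return loss


# Predictions with 1, 2, and 3 views; the last pair violates the ordering by 5.
print(prediction_ranking_loss([40.0, 55.0, 50.0]))   # 5.0
# Uncertainties with 1, 2, and 3 views; the last pair violates the ordering by 2.
print(uncertainty_ranking_loss([9.0, 4.0, 6.0]))     # 2.0
```

In a semi-supervised setup, penalties of this kind would be computed on unlabeled multi-view frames and added to the usual supervised counting loss on the labeled frames.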
Problem

Research questions and friction points this paper is trying to address.

Addresses limited multi-view crowd counting data via semi-supervised ranking methods
Reduces reliance on labeled data by ranking predictions and uncertainties across views
Improves multi-view fusion models under severe occlusion in wide scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ranking multi-view fusion models by prediction consistency
Ranking multi-view fusion models by uncertainty estimation
Semi-supervised training with limited labeled multi-view data
Qi Zhang
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Yunfei Gong
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Zhidan Xie
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Zhizi Wang
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Antoni B. Chan
Professor of Computer Science, City University of Hong Kong
Computer Vision, Machine Learning, Surveillance, Eye Gaze Analysis, Computer Audition
Hui Huang
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China