Exclusivity-Guided Mask Learning for Semi-Supervised Crowd Instance Segmentation and Counting

📅 2026-03-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of semi-supervised crowd analysis: sparse point annotations carry no fine-grained structural semantics, which limits individual instance segmentation and counting in dense scenes. The authors first propose EDP-SAM, an Exclusion-Constrained Dual-Prompt module built on the SAM architecture, which uses a Nearest Neighbor Exclusion Circle (NNEC) constraint to generate mask supervision for existing datasets. They then propose Exclusivity-Guided Mask Learning (XMask), which enforces spatial separation between instances through a discriminative mask objective and uses Gaussian smoothing and differentiable center sampling to improve feature continuity and training stability. The resulting shape-informative instance masks serve as pseudo-labels for semi-supervised counting, unifying segmentation and counting in one framework. With 5%, 10%, and 40% labeled data, the method is reported to achieve state-of-the-art semi-supervised performance on ShanghaiTech A, UCF-QNRF, and JHU++, narrowing the gap between the two tasks.
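
The summary does not spell out how the nearest-neighbor exclusion constraint is computed. The sketch below is one plausible reading, not the authors' implementation: each annotated head point gets a circle whose radius is half the distance to its nearest annotated neighbor, which guarantees that no two circles overlap. The function names `exclusion_radii` and `exclusive_masks`, the half-distance rule, and the optional `max_radius` cap are all assumptions made for illustration.

```python
import numpy as np


def exclusion_radii(points):
    """Half the distance from each point to its nearest neighbor.

    Assumption: the exclusion radius is d_nn / 2, so for any pair i, j
    r_i + r_j <= d_ij and the open circles cannot overlap.
    """
    pts = np.asarray(points, dtype=float)          # shape (N, 2), (x, y) order
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))       # pairwise distances
    np.fill_diagonal(dist, np.inf)                 # ignore self-distance
    return 0.5 * dist.min(axis=1)


def exclusive_masks(points, height, width, max_radius=None):
    """Rasterize one disjoint circular mask per point into a label map.

    Returns (labels, radii); labels is 0 for background and i for the
    i-th point (1-based). max_radius is a hypothetical cap that keeps
    isolated points from claiming the whole image.
    """
    pts = np.asarray(points, dtype=float)
    radii = exclusion_radii(pts)
    if max_radius is not None:
        radii = np.minimum(radii, max_radius)
    ys, xs = np.mgrid[0:height, 0:width]
    labels = np.zeros((height, width), dtype=np.int32)
    for i, ((x, y), r) in enumerate(zip(pts, radii), start=1):
        # Strict inequality keeps touching circles from sharing a pixel.
        inside = (xs - x) ** 2 + (ys - y) ** 2 < r ** 2
        labels[inside] = i
    return labels, radii


labels, radii = exclusive_masks([(2, 2), (8, 2), (8, 6)], height=10, width=12)
```

Circular masks of this kind could serve as a spatial prior when prompting a segmentation model with points, since every pixel is claimed by at most one instance. How EDP-SAM actually combines the constraint with its prompts is not described here.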

📝 Abstract

Semi-supervised crowd analysis is a prominent area of research, as unlabeled data are typically abundant and inexpensive to obtain. However, traditional point-based annotations constrain performance because individual regions are inherently ambiguous, and consequently, learning fine-grained structural semantics from sparse annotations remains an unresolved challenge. In this paper, we first propose an Exclusion-Constrained Dual-Prompt SAM (EDP-SAM), based on our Nearest Neighbor Exclusion Circle (NNEC) constraint, to generate mask supervision for current datasets. With the aim of segmenting individuals in dense scenes, we then propose Exclusivity-Guided Mask Learning (XMask), which enforces spatial separation through a discriminative mask objective. Gaussian smoothing and a differentiable center sampling strategy are utilized to improve feature continuity and training stability. Building on XMask, we present a semi-supervised crowd counting framework that uses instance mask priors as pseudo-labels, which contain richer shape information than traditional point cues. Extensive experiments on the ShanghaiTech A, UCF-QNRF, and JHU++ datasets (using 5%, 10%, and 40% labeled data) verify that our end-to-end model achieves state-of-the-art semi-supervised segmentation and counting performance, effectively bridging the gap between counting and instance segmentation within a unified framework.
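
The abstract names Gaussian smoothing and a differentiable center sampling strategy without giving their form. As a rough illustration only, the sketch below smooths a binary mask with a separable Gaussian and then estimates its center with a soft-argmax, a common differentiable substitute for a hard argmax. Using soft-argmax here is an assumption, not something the abstract confirms, and the names `gaussian_smooth` and `soft_center` as well as the `sigma` and `temperature` parameters are hypothetical.

```python
import numpy as np


def gaussian_kernel1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def gaussian_smooth(mask, sigma=1.0):
    """Separable Gaussian blur with zero padding.

    Note: np.convolve(mode="same") keeps the input length only when each
    image side is at least as long as the kernel (2 * radius + 1).
    """
    k = gaussian_kernel1d(sigma)
    m = np.asarray(mask, dtype=float)
    m = np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, m)
    m = np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, m)
    return m


def soft_center(score, temperature=1.0):
    """Soft-argmax: softmax-weighted mean of pixel coordinates.

    Every step is a smooth function of `score`, so gradients could flow
    through it in an autodiff framework; lower temperatures approach the
    hard argmax.
    """
    s = np.asarray(score, dtype=float) / temperature
    w = np.exp(s - s.max())                        # numerically stable softmax
    w /= w.sum()
    ys, xs = np.mgrid[0:s.shape[0], 0:s.shape[1]]
    return float((w * xs).sum()), float((w * ys).sum())


mask = np.zeros((11, 11))
mask[4:7, 4:7] = 1.0                               # 3x3 blob centered at (5, 5)
smooth = gaussian_smooth(mask, sigma=1.0)
cx, cy = soft_center(smooth, temperature=0.1)
```

For a blob that is symmetric about pixel (5, 5), the estimated center lands on that pixel, while the smoothed map varies continuously across the blob boundary instead of jumping from 0 to 1.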

Problem

Research questions and friction points this paper is trying to address.

semi-supervised learning

crowd instance segmentation

crowd counting

point-based annotation

structural semantics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exclusivity-Guided Mask Learning

Semi-Supervised Crowd Segmentation

Instance Mask Priors

Nearest Neighbor Exclusion Circle

Differentiable Center Sampling
