AI Summary
This work addresses the challenge of automatically counting identical 3D objects stacked under severe occlusion in industrial inspection scenarios where weighing or manual counting is infeasible. The authors propose an approach that integrates 3D geometric reconstruction with deep learning, decomposing the counting task into two subtasks: reconstructing the 3D geometry of the stack from multi-view images and estimating its occupancy ratio. By jointly leveraging depth estimation and geometric modeling, the method infers the distribution of objects in occluded regions within a container. The authors present this as a tight integration of 3D geometry and deep learning for counting heavily occluded stacks, aimed at overcoming the limitations of conventional 2D methods, which can only count what is directly visible. Experiments on large-scale synthetic data and real-world industrial data indicate that the method is accurate and robust, supporting its suitability for practical production-line inspection systems.
Abstract
Visual object counting is a fundamental computer vision task in industrial inspection, where accurate, high-throughput inventory tracking and quality assurance are critical. Moreover, manufactured parts are often too light to reliably deduce their count from their weight, or too heavy to move the stack on a scale safely and practically, making automated visual counting the more robust solution in many scenarios. However, existing methods struggle with stacked 3D items in containers, pallets, or bins, where most objects are heavily occluded and only a few are directly visible. To address this important yet underexplored challenge, we propose a novel 3D counting approach that decomposes the task into two complementary subproblems: estimating the 3D geometry of the stack and its occupancy ratio from multi-view images. By combining geometric reconstruction with deep learning-based depth analysis, our method can accurately count identical manufactured parts inside containers, even when they are irregularly stacked and partially hidden. We validate our 3D counting pipeline on large-scale synthetic and diverse real-world data with manually verified total counts, demonstrating robust performance under realistic inspection conditions.
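To make the decomposition concrete, the two estimates can be combined into a count with simple arithmetic. The snippet below is a minimal sketch, not the authors' implementation: it assumes the count is obtained by multiplying the reconstructed stack volume by the occupancy ratio and dividing by the known volume of a single part. The function name `estimate_count` and all of its parameters are hypothetical and chosen only for illustration.

```python
def estimate_count(stack_volume: float, occupancy_ratio: float, unit_volume: float) -> int:
    """Illustrative count estimate from two quantities (assumed formula).

    stack_volume:    volume enclosed by the reconstructed stack surface
    occupancy_ratio: fraction of that volume filled by parts, in [0, 1]
    unit_volume:     volume of one part, in the same units as stack_volume

    All names and the formula itself are assumptions made for this sketch.
    """
    if stack_volume < 0:
        raise ValueError("stack_volume must be non-negative")
    if not 0.0 <= occupancy_ratio <= 1.0:
        raise ValueError("occupancy_ratio must lie in [0, 1]")
    if unit_volume <= 0:
        raise ValueError("unit_volume must be positive")

    # Volume actually filled by parts, excluding the gaps between them.
    occupied_volume = stack_volume * occupancy_ratio

    # Number of parts that account for the occupied volume.
    return round(occupied_volume / unit_volume)


if __name__ == "__main__":
    # Hypothetical numbers: a 12,000 cm^3 stack, half filled, 20 cm^3 per part.
    print(estimate_count(12000.0, 0.5, 20.0))
```

Under this assumed formulation, the geometric term depends only on the visible outer shape of the stack, while the occupancy term carries the information about how densely the hidden interior is packed, which is why the two can be estimated separately.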