🤖 AI Summary
To address labor-intensive, inconsistent manual monitoring of cattle behavior and the delayed disease detection that results, this paper proposes a multi-camera, real-time tracking system. It integrates a YOLO11-m detector and the zero-shot segmentation model SAMURAI, with its motion-aware memory mechanism, within a ViT-based multi-object tracking framework. Panoramic stitching of six camera feeds via homography transformation, together with a hybrid of linear Kalman filtering and IoU-based data association, keeps identities consistent across views and supports high-precision instance segmentation. Key contributions include: (i) the first application of SAMURAI to livestock vision analytics; (ii) motion-aware memory that strengthens temporal modeling under occlusion and posture change; and (iii) redundancy elimination in overlapping regions via panoramic alignment. On two benchmark sequences, the system achieves MOTA of 98.7% and 99.3%, IDF1 above 99%, and near-zero ID switches, substantially outperforming Deep SORT Realtime. The work offers a robust, scalable approach to quantitative behavioral analysis in smart farms.
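The panoramic alignment step can be illustrated with a minimal sketch. It assumes each camera already has a known 3x3 homography into panorama coordinates; the matrices `H_A` and `H_B`, the sample cow centres, and the distance-based `merge_duplicates` rule are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def project_point(h: Matrix3, point: Point) -> Point:
    """Map an image point into panorama coordinates with a 3x3 homography."""
    x, y = point
    xp = h[0][0] * x + h[0][1] * y + h[0][2]
    yp = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Perspective divide converts homogeneous coordinates back to pixels.
    return (xp / w, yp / w)


def merge_duplicates(points: List[Point], radius: float) -> List[Point]:
    """Greedily keep one point per group of points closer than `radius`.

    This is an assumed, simplified stand-in for removing redundant
    detections where camera views overlap.
    """
    kept: List[Point] = []
    for p in points:
        if all((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 > radius ** 2 for q in kept):
            kept.append(p)
    return kept


# Hypothetical homographies: camera A is the reference frame,
# camera B is the same view shifted 100 px to the right.
H_A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H_B = [[1.0, 0.0, 100.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

centres_a = [(50.0, 40.0), (120.0, 60.0)]  # cow centres seen by camera A
centres_b = [(21.0, 61.0), (80.0, 30.0)]   # cow centres seen by camera B

panorama = [project_point(H_A, p) for p in centres_a]
panorama += [project_point(H_B, p) for p in centres_b]
unique = merge_duplicates(panorama, radius=5.0)
print(unique)
```

In this toy setup the second cow in camera A and the first cow in camera B land within a few pixels of each other in the panorama, so only one of them is kept. A real pipeline would estimate the homographies from matched reference points and would likely compare masks or boxes rather than centres.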
📝 Abstract
Activity and behaviour correlate with dairy cow health and welfare, making continuous and accurate monitoring crucial for disease identification and farm productivity. Manual observation and frequent assessment are laborious and inconsistent. In this study, we developed a multi-camera, real-time tracking system for indoor-housed Holstein Friesian dairy cows. The system uses instance segmentation and tracking algorithms to monitor cow activity continuously and accurately. An integrated top-down barn panorama was created by geometrically aligning six camera feeds using homography transformations. The detection stage used a YOLO11-m model fine-tuned on an overhead cow dataset, achieving high accuracy (mAP@0.50 = 0.97, F1 = 0.95). For instance segmentation, SAMURAI, an adaptation of Segment Anything Model 2.1, generated pixel-precise cow masks using zero-shot learning and motion-aware memory. For object tracking, a motion-aware linear Kalman filter with IoU-based data association maintained cow identities over time, even under occlusion and changing posture. The proposed system significantly outperformed Deep SORT Realtime: Multi-Object Tracking Accuracy (MOTA) was 98.7% and 99.3% on two benchmark video sequences, with IDF1 scores above 99% and near-zero identity switches. These results indicate that the unified multi-camera system can track dairy cows in complex indoor environments in real time. The system reduces redundant detections across overlapping cameras and maintains continuity as cows move between viewpoints, with the aim of improving early disease prediction through activity quantification and behavioural classification.
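The tracking stage pairs a linear Kalman filter with IoU-based data association. The sketch below shows one way such a loop could look, under several assumptions that are not from the paper: a constant-velocity motion model applied independently to the x and y box centres, illustrative noise values `q` and `r`, a greedy matching rule, and an IoU threshold of 0.3. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class AxisKalman:
    """1-D constant-velocity Kalman filter; state is (position, velocity)."""
    pos: float
    vel: float = 0.0
    p: List[List[float]] = field(default_factory=lambda: [[10.0, 0.0], [0.0, 10.0]])
    q: float = 0.01  # process noise (assumed value)
    r: float = 1.0   # measurement noise (assumed value)

    def predict(self, dt: float = 1.0) -> float:
        self.pos += self.vel * dt
        (a, b), (c, d) = self.p
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.p = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.pos

    def update(self, z: float) -> None:
        (a, b), (c, d) = self.p
        s = a + self.r            # innovation variance (only position is measured)
        k0, k1 = a / s, c / s     # Kalman gain
        resid = z - self.pos
        self.pos += k0 * resid
        self.vel += k1 * resid
        self.p = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


@dataclass
class Track:
    track_id: int
    box: Box
    kx: AxisKalman = field(init=False)
    ky: AxisKalman = field(init=False)

    def __post_init__(self) -> None:
        self.kx = AxisKalman((self.box[0] + self.box[2]) / 2)
        self.ky = AxisKalman((self.box[1] + self.box[3]) / 2)

    def predict(self) -> Box:
        """Move the box to the predicted centre, keeping its last size."""
        cx, cy = self.kx.predict(), self.ky.predict()
        w, h = self.box[2] - self.box[0], self.box[3] - self.box[1]
        self.box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
        return self.box

    def update(self, det: Box) -> None:
        self.kx.update((det[0] + det[2]) / 2)
        self.ky.update((det[1] + det[3]) / 2)
        self.box = det


def associate(tracks: List[Track], detections: List[Box],
              min_iou: float = 0.3) -> Dict[int, int]:
    """Predict every track, then greedily match by IoU. Returns {track_id: det_index}."""
    predicted = [(t, t.predict()) for t in tracks]
    pairs = sorted(((iou(box, det), t.track_id, j)
                    for t, box in predicted
                    for j, det in enumerate(detections)), reverse=True)
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break
        if tid not in matches and j not in used:
            matches[tid] = j
            used.add(j)
    by_id = {t.track_id: t for t in tracks}
    for tid, j in matches.items():
        by_id[tid].update(detections[j])
    return matches


tracks = [Track(1, (0.0, 0.0, 10.0, 10.0)), Track(2, (50.0, 50.0, 60.0, 60.0))]
detections = [(51.0, 50.0, 61.0, 60.0), (1.0, 0.0, 11.0, 10.0)]
matches = associate(tracks, detections)
print(matches)
```

Greedy matching keeps the example short; an optimal assignment (for instance the Hungarian algorithm) is a common alternative when many animals overlap. Unmatched detections and unmatched tracks are left unhandled here, whereas a full tracker would start new tracks and retire stale ones.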