Vision transformer-based multi-camera multi-object tracking framework for dairy cow monitoring

📅 2025-08-03
🤖 AI Summary
To address labor-intensive, inconsistent, and delayed disease detection in manual cattle behavior monitoring, this paper proposes a multi-view collaborative real-time tracking system. Methodologically, it integrates YOLO11-m for detection, the zero-shot segmentation model SAMURAI, and a motion-aware memory mechanism within a ViT-based multi-object tracking framework; panoramic stitching via homography transformation and linear Kalman filter–IoU hybrid data association ensure cross-view identity consistency and high-precision instance segmentation. Key contributions include: (i) the first application of SAMURAI to livestock vision analytics; (ii) a novel motion-aware memory module enhancing temporal modeling; and (iii) redundancy elimination in overlapping regions via panoramic alignment. On two benchmark sequences, the system achieves MOTA of 98.7% and 99.3%, IDF1 > 99%, and near-zero ID switches, substantially outperforming Deep SORT. This work establishes a robust, scalable paradigm for quantitative behavioral analysis in smart farms.
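
As a rough illustration of the homography step mentioned in the summary, the sketch below maps a detection box from one camera's pixel coordinates into a shared panorama frame. It is not the authors' code: the function names and the matrix `H_CAM2` are hypothetical, and in practice each camera's 3x3 homography would be estimated from point correspondences (for example with OpenCV's `cv2.findHomography`).

```python
def apply_homography(H, point):
    """Map an (x, y) pixel through a 3x3 homography H given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get back to 2-D.
    return (u / w, v / w)


def box_to_panorama(H, box):
    """Warp the four corners of (x1, y1, x2, y2) and return their bounding box."""
    x1, y1, x2, y2 = box
    corners = [apply_homography(H, p)
               for p in ((x1, y1), (x2, y1), (x2, y2), (x1, y2))]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical homography for a second camera: a pure 640-pixel shift to the
# right. Real barn cameras would also need rotation and perspective terms.
H_CAM2 = [[1.0, 0.0, 640.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]

panorama_box = box_to_panorama(H_CAM2, (10, 20, 110, 220))
```

Once every camera's detections live in the same panorama frame, boxes from overlapping views that cover the same animal can be compared directly, which is what makes removing duplicate detections possible.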

📝 Abstract
Activity and behaviour correlate with dairy cow health and welfare, making continual and accurate monitoring crucial for disease identification and farm productivity. Manual observation and frequent assessments are laborious and inconsistent for activity monitoring. In this study, we developed a multi-camera, real-time tracking system for indoor-housed Holstein Friesian dairy cows. The system uses computer vision techniques, including instance segmentation and tracking algorithms, to monitor cow activity seamlessly and accurately. An integrated top-down barn panorama was created by geometrically aligning six camera feeds using homographic transformations. The detection phase used a refined YOLO11-m model trained on an overhead cow dataset, obtaining high accuracy (mAP@0.50 = 0.97, F1 = 0.95). For instance segmentation, SAMURAI, an upgraded Segment Anything Model 2.1, generated pixel-precise cow masks using zero-shot learning and motion-aware memory. For object tracking, a motion-aware linear Kalman filter with IoU-based data association reliably identified cows over time, even under occlusion and changing posture. The proposed system significantly outperformed Deep SORT Realtime: Multi-Object Tracking Accuracy (MOTA) was 98.7% and 99.3% in two benchmark video sequences, with IDF1 scores above 99% and near-zero identity switches. Our results indicate that this unified multi-camera system can track dairy cows in complex indoor environments in real time. The system reduces redundant detections across overlapping cameras and maintains continuity as cows move between viewpoints, with the aim of improving early sickness prediction through activity quantification and behavioural classification.
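
For readers unfamiliar with the reported metrics, the following sketch shows how MOTA and IDF1 are conventionally computed from error counts. The function names are our own and the example counts are invented for illustration; they are not taken from the paper's benchmark sequences.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), computed on identity matches."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Made-up counts: 500 ground-truth boxes, 6 misses, 3 false alarms, 1 switch.
example_mota = mota(false_negatives=6, false_positives=3, id_switches=1, num_gt=500)
# Made-up identity-level counts.
example_idf1 = idf1(idtp=95, idfp=5, idfn=5)
```

MOTA penalises every miss, false alarm, and identity switch equally, whereas IDF1 measures how consistently each animal keeps the same identity over the whole sequence, so a tracker can score well on one and poorly on the other.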
Problem

Research questions and friction points this paper is trying to address.

Develop real-time multi-camera tracking for dairy cow monitoring
Improve accuracy in cow activity and behavior analysis
Enhance early disease detection through automated behavioral classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-camera tracking with homographic transformations
YOLO11-m for detection and SAMURAI for zero-shot instance segmentation
Motion-aware Kalman filter for reliable tracking
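
The tracking idea listed above (predict each cow's position with a linear Kalman filter, then match predictions to new detections by IoU) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and function names, the noise values `q` and `r`, the constant-velocity model on the box centre, the greedy matching strategy, and the `min_iou` threshold are all hypothetical choices.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class AxisKalman:
    """1-D constant-velocity Kalman filter; state is [position, velocity]."""

    def __init__(self, pos, q=1.0, r=4.0):
        self.x = [float(pos), 0.0]
        self.P = [[10.0, 0.0], [0.0, 10.0]]  # state covariance
        self.q, self.r = q, r                # assumed process / measurement noise

    def predict(self):
        p, v = self.x
        self.x = [p + v, v]                  # one frame ahead (dt = 1)
        (a, b), (c, d) = self.P
        # P <- F P F^T + q*I with F = [[1, 1], [0, 1]]
        self.P = [[a + b + c + d + self.q, b + d], [c + d, d + self.q]]
        return self.x[0]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                       # innovation variance (only position is observed)
        k0, k1 = a / s, c / s                # Kalman gain
        y = z - self.x[0]                    # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


class Track:
    """One tracked animal: filtered box centre plus last observed box size."""

    def __init__(self, track_id, box):
        self.id = track_id
        self.w, self.h = box[2] - box[0], box[3] - box[1]
        self.kx = AxisKalman((box[0] + box[2]) / 2)
        self.ky = AxisKalman((box[1] + box[3]) / 2)

    def predict(self):
        cx, cy = self.kx.predict(), self.ky.predict()
        return (cx - self.w / 2, cy - self.h / 2, cx + self.w / 2, cy + self.h / 2)

    def update(self, box):
        self.w, self.h = box[2] - box[0], box[3] - box[1]
        self.kx.update((box[0] + box[2]) / 2)
        self.ky.update((box[1] + box[3]) / 2)


def associate(predicted, detections, min_iou=0.3):
    """Greedy IoU matching; returns {track_id: detection_index}."""
    pairs = sorted(((iou(pb, db), tid, j)
                    for tid, pb in predicted.items()
                    for j, db in enumerate(detections)), reverse=True)
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < min_iou:
            break                            # remaining pairs overlap too little
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    return matches


tracks = {1: Track(1, (0, 0, 10, 10)), 2: Track(2, (50, 50, 60, 60))}
detections = [(51, 50, 61, 60), (1, 0, 11, 10)]
predicted = {tid: t.predict() for tid, t in tracks.items()}
matches = associate(predicted, detections)
for tid, j in matches.items():
    tracks[tid].update(detections[j])
```

Greedy matching keeps the sketch short; an optimal assignment (for example the Hungarian algorithm) is a common alternative when many animals overlap. Tracks left unmatched can keep coasting on their predictions for a few frames, which is how this family of trackers bridges short occlusions.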
Authors

Kumail Abbas
International Graduate Program of Veterinary Science and Technology, Faculty of Veterinary Science, Chulalongkorn University, Bangkok, 10440, Thailand
Zeeshan Afzal
Postdoc Researcher at Linköping University
Cyber Security, Network Security, Critical Infrastructure Security, Threat Modeling
Aqeel Raza
Research Unit of Data Innovation for Livestock Development, Department of Veterinary Medicine, Faculty of Veterinary Science, Chulalongkorn University, Bangkok, 10330, Thailand
Taha Mansouri
Lecturer in AI, University of Salford
Computer Vision, Ethical AI
Andrew W. Dowsey
Bristol Veterinary School, University of Bristol, United Kingdom
Chaidate Inchaisri
Research Unit of Data Innovation for Livestock Development, Department of Veterinary Medicine, Faculty of Veterinary Science, Chulalongkorn University, Bangkok, 10330, Thailand
Ali Alameer
Lecturer in AI
Computer Vision, Ethical AI