🤖 AI Summary
This study addresses the prevailing focus on negative indicators in automated animal welfare monitoring by targeting a positive behavior: play in poultry. It proposes PlayClass, a classification pipeline for top-down pen video built on frozen image and video foundation models. For long-duration tracking, the pipeline runs SAM 3 in chunks whose boundaries are guided by YOLO, which reduces identity errors in point-based prompting. For classification, it combines handcrafted motion features from the tracked masks with V-JEPA 2.1 video embeddings. V-JEPA 2.1 outperformed all other backbones across model scales, reaching a macro-averaged F₁ score of 77.0 when combined with the handcrafted features. The dataset remains challenging because some play sub-types resemble non-play behavior kinematically and birds occlude one another, so the authors present the result as encouraging evidence towards automated assessment of positive animal behaviors.
📝 Abstract
Automated monitoring of animal welfare has largely targeted negative indicators, leaving positive welfare behaviours such as play underexplored. To address this gap, we present PlayClass, a pipeline for play-behaviour classification in poultry from top-down pen video. The pipeline performs long-duration tracking with SAM 3, using YOLO-guided chunk boundaries to minimise identity errors in point-based prompting, and classifies play actions from frozen embeddings of image and video foundation models. Although handcrafted motion features from tracked masks alone achieved competitive accuracy, V-JEPA 2.1 consistently outperformed all other backbones across model scales, reaching 77.0 macro-averaged F$_1$ when combined with handcrafted features. Despite this result, the dataset remains challenging for two reasons: play sub-types share similar kinematic profiles with non-play behaviour, and birds frequently occlude one another. Overall, our work provides encouraging evidence towards automated frameworks for play-behaviour classification in poultry.
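For readers unfamiliar with the headline metric, the following is a minimal sketch of how a macro-averaged F$_1$ score can be computed: the per-class F$_1$ scores are averaged with equal weight, so rare classes count as much as common ones. The function name `macro_f1` and the class labels `play` and `non-play` are illustrative assumptions, not the paper's evaluation code or label set, and the score is scaled to 0-100 only to match the way the abstract reports it.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1 on a 0-100 scale (hypothetical helper, not from the paper)."""
    # Classes are taken from both the ground truth and the predictions.
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    # Unweighted mean over classes, regardless of class frequency.
    return 100.0 * sum(per_class) / len(per_class)


# Toy labels, assumed for illustration only.
y_true = ["play", "play", "non-play", "non-play", "non-play", "non-play"]
y_pred = ["play", "non-play", "non-play", "non-play", "non-play", "play"]
print(macro_f1(y_true, y_pred))
```

In this toy case the `play` class scores 0.5 and the `non-play` class 0.75, so the macro average is 62.5 even though 4 of 6 clips (about 66.7%) are classified correctly. That gap is why macro averaging suits datasets where play is rarer than non-play.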