🤖 AI Summary
In pasture monitoring, cattle behavioral interaction analysis—particularly estrus detection—is hindered by the scarcity of large-scale annotated datasets and fine-grained interaction labels. Method: This paper proposes CattleAct, the first framework to decouple group-level interactions into compositional individual actions and jointly embed actions and interactions into a unified latent space. It employs contrastive learning to fine-tune pretrained action representations, enabling data-efficient generalization to rare interactions. Furthermore, it fuses multimodal cues—including video appearance features and GPS trajectory sequences—to enhance robustness in single-frame interaction discrimination. Contribution/Results: Evaluated in real-world pasture deployments, CattleAct significantly outperforms existing baselines, achieving state-of-the-art accuracy in cattle interaction detection. To foster reproducibility and community advancement, we publicly release all source code and data processing tools.
📝 Abstract
This paper introduces a method and application for automatically detecting behavioral interactions between grazing cattle from a single image, which is essential for smart livestock management in the cattle industry, such as for detecting estrus. Although interaction detection for humans has been actively studied, a non-trivial challenge lies in cattle interaction detection, specifically the lack of a comprehensive behavioral dataset that includes interactions, as the interactions of grazing cattle are rare events. We, therefore, propose CattleAct, a data-efficient method for interaction detection by decomposing interactions into the combinations of actions by individual cattle. Specifically, we first learn an action latent space from a large-scale cattle action dataset. Then, we embed rare interactions via the fine-tuning of the pre-trained latent space using contrastive learning, thereby constructing a unified latent space of actions and interactions. On top of the proposed method, we develop a practical working system integrating video and GPS inputs. Experiments on a commercial-scale pasture demonstrate the accurate interaction detection achieved by our method compared to the baselines. Our implementation is available at https://github.com/rakawanegan/CattleAct.