Detection and Identification of Penguins Using Appearance and Motion Features

2026-03-03
Citations: 0
Influential: 0
AI Summary
This study addresses individual detection and identification of penguins in animal facilities, where high visual similarity, frequent posture changes, and water-surface reflections degrade performance. The authors propose a framework that combines appearance and motion cues for both tasks. For detection, YOLO11 is adapted to take consecutive frames as input, so that motion cues compensate for the lack of temporal consistency in single-frame detectors and help detect targets whose distinctive visual features are obscured. Fine-tuning with two-frame inputs raises detection mAP@0.5 from 0.922 to 0.933 and recovers individuals that cannot be picked out in static images. For identification, a tracklet-based contrastive learning approach is applied after tracking; qualitative visualization shows that embeddings of the same individual are drawn closer together in feature space, suggesting potential for mitigating ID switches.
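
One plausible way to feed consecutive frames to a detector is to stack each frame with its predecessor along the channel axis. The sketch below shows only that input-construction step in NumPy. The channel-stacking choice and the name `stack_frame_pairs` are assumptions for illustration, since the text here does not specify how the two frames are combined.

```python
import numpy as np


def stack_frame_pairs(frames: np.ndarray) -> np.ndarray:
    """Pair each frame with the next one along the channel axis.

    frames:  (T, H, W, C) video clip.
    returns: (T - 1, H, W, 2 * C); entry i holds frame i followed by frame i + 1.

    Hypothetical helper: channel stacking is an assumed way of building
    two-frame inputs, not necessarily the authors' method.
    """
    if frames.ndim != 4 or frames.shape[0] < 2:
        raise ValueError("expected a (T, H, W, C) array with T >= 2")
    previous, current = frames[:-1], frames[1:]
    return np.concatenate([previous, current], axis=-1)


# Synthetic 5-frame RGB clip standing in for real footage.
clip = np.random.default_rng(0).integers(0, 256, size=(5, 64, 64, 3), dtype=np.uint8)
pairs = stack_frame_pairs(clip)
print(pairs.shape)  # (4, 64, 64, 6)
```

Under this assumption, a detector consuming these pairs would need its first convolution widened from 3 to 6 input channels; that step is not shown here.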

Abstract
In animal facilities, continuous surveillance of penguins is essential yet technically challenging due to their homogeneous visual characteristics, rapid and frequent posture changes, and substantial environmental noise such as water reflections. In this study, we propose a framework that enhances both detection and identification performance by integrating appearance and motion features. For detection, we adapted YOLO11 to process consecutive frames to overcome the lack of temporal consistency in single-frame detectors. This approach leverages motion cues to detect targets even when distinct visual features are obscured. Our evaluation shows that fine-tuning the model with two-frame inputs improves mAP@0.5 from 0.922 to 0.933, outperforming the baseline, and successfully recovers individuals that are indistinguishable in static images. For identification, we introduce a tracklet-based contrastive learning approach applied after tracking. Through qualitative visualization, we demonstrate that the method produces coherent feature embeddings, bringing samples from the same individual closer in the feature space, suggesting the potential for mitigating ID switching.
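
To make the tracklet-based contrastive idea concrete, here is a minimal NumPy sketch of a supervised-contrastive-style loss in which crops from the same tracklet act as positives and all other crops act as negatives. The loss form, the temperature value, and the name `tracklet_contrastive_loss` are assumptions for illustration; the abstract does not state the exact objective used.

```python
import numpy as np


def tracklet_contrastive_loss(embeddings, tracklet_ids, temperature=0.1):
    """Supervised-contrastive-style loss with tracklet IDs as labels.

    Hypothetical sketch: samples sharing a tracklet ID are positives,
    every other sample is a negative. Lower values mean that embeddings
    from the same tracklet sit closer together.
    """
    z = np.asarray(embeddings, dtype=float)
    ids = np.asarray(tracklet_ids)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length rows

    n = len(z)
    not_self = ~np.eye(n, dtype=bool)
    positives = (ids[:, None] == ids[None, :]) & not_self

    # Scaled cosine similarities, with self-similarity masked out.
    sim = np.where(not_self, (z @ z.T) / temperature, -np.inf)

    # Row-wise log-softmax over all other samples (numerically stabilised).
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))

    pos_counts = positives.sum(axis=1)
    valid = pos_counts > 0  # anchors with at least one positive
    if not valid.any():
        raise ValueError("no tracklet contains two or more samples")

    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_counts[valid]).mean())


# Toy 2-D embeddings: two tight groups.
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
coherent = tracklet_contrastive_loss(toy, [0, 0, 1, 1])  # IDs match the groups
mixed = tracklet_contrastive_loss(toy, [0, 1, 0, 1])     # IDs cut across the groups
print(coherent < mixed)  # True
```

The toy comparison illustrates the intended behaviour: when tracklet IDs agree with the geometric grouping of the embeddings the loss is small, and when they cut across it the loss is large, which is the pressure that would pull same-individual samples together during training.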
Problem

Research questions and friction points this paper is trying to address.

penguin detection
individual identification
visual homogeneity
motion features
environmental noise
Innovation

Methods, ideas, or system contributions that make the work stand out.

motion-aware detection
YOLO11 adaptation
tracklet-based contrastive learning
temporal consistency
animal identification
Kasumi Seko
School of Social Information Science, University of Hyogo, Hyogo, Japan
Hiroki Kinoshita
Graduate School of Information Science, University of Hyogo, Hyogo, Japan
Raj Rajeshwar Malinda
Graduate School of Information Science, University of Hyogo, Hyogo, Japan
Hiroaki Kawashima
University of Hyogo
Human Computer Interaction, Pattern Recognition, Distributed Coordinated Control, Computer Vision