WildLIFT: Lifting monocular drone video to 3D for species-agnostic wildlife monitoring

📅 2026-04-27

🤖 AI Summary
Existing monocular drone-based wildlife monitoring pipelines work predominantly in 2D image space and leave 3D geometric information unused. The authors propose WildLIFT, a species-agnostic 3D detection and tracking framework that combines 3D scene geometry reconstructed from monocular video with open-vocabulary 2D instance segmentation to produce oriented 3D bounding boxes labelled with semantic face information. These labels support quantitative analysis of viewpoint coverage and inter-animal occlusion. Keyframe-based refinement substantially reduces manual 3D annotation effort. Validated on 2,581 manually curated frames with over 6,700 3D detections across four large mammal species, the framework maintains high identity consistency in multi-animal 3D tracking and yields structured metadata for downstream ecological analysis.


📝 Abstract
Monocular RGB cameras mounted on drones are widely used for wildlife monitoring, yet most analytical pipelines remain confined to two-dimensional image space, leaving geometric information in video underexploited. We present WildLIFT, a computational framework that integrates three-dimensional scene geometry from monocular drone video with open-vocabulary 2D instance segmentation to enable species-agnostic 3D detection and tracking. Oriented 3D bounding box labels with semantic face information enable quantitative assessment of viewpoint coverage and inter-animal occlusion, producing structured metadata for downstream ecological analyses. We validate the framework on 2,581 manually curated frames comprising over 6,700 3D detections across four large mammal species. WildLIFT maintains high identity consistency in multi-animal scenes and substantially reduces manual 3D annotation effort through keyframe-based refinement. By transforming standard drone footage into structured 3D and viewpoint-aware representations, WildLIFT extends the analytical utility of aerial wildlife datasets for behavioural research and population monitoring.
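
The abstract does not spell out how 2D instance masks are lifted into oriented 3D boxes. As a rough illustration only, and not the authors' method, the sketch below assumes a per-pixel depth map from the reconstruction, a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and a PCA fit for the box orientation. The function names `backproject_mask` and `fit_oriented_box` are invented for this example.

```python
import numpy as np

def backproject_mask(depth, mask, fx, fy, cx, cy):
    """Lift masked pixels to 3D camera-frame points (assumed pinhole model)."""
    # Keep only pixels inside the boolean instance mask with valid depth.
    v, u = np.nonzero(mask & (depth > 0))
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def fit_oriented_box(points):
    """Fit an oriented 3D box with PCA; returns (center, axes, extents)."""
    mean = points.mean(axis=0)
    # Eigenvectors of the covariance matrix serve as box axes (columns).
    _, axes = np.linalg.eigh(np.cov((points - mean).T))
    # Express points in the box frame and take min/max along each axis.
    local = (points - mean) @ axes
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = mean + axes @ ((lo + hi) / 2.0)
    return center, axes, hi - lo

# Toy usage: a flat rectangular patch at constant depth.
depth = np.full((4, 6), 2.0)
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 1:5] = True
points = backproject_mask(depth, mask, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
center, axes, extents = fit_oriented_box(points)
```

A PCA fit gives axes only up to sign and carries no notion of front, back, or sides, so the semantic face labels described in the abstract would need additional information that this sketch does not model.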
Problem

Research questions and friction points this paper is trying to address.

wildlife monitoring
monocular drone video
3D detection
species-agnostic
scene geometry
Innovation

Methods, ideas, or system contributions that make the work stand out.

monocular 3D reconstruction
species-agnostic detection
drone-based wildlife monitoring
oriented 3D bounding boxes
open-vocabulary segmentation
Vandita Shukla
3D Optical Metrology, Fondazione Bruno Kessler, Trento, Italy; Computer Vision and Machine Learning Systems Group, University of Muenster, Muenster, Germany
Fabio Remondino
3D Optical Metrology - Bruno Kessler Foundation
photogrammetry, 3D modeling, AI
Blair Costelloe
Department of Collective Behaviour, Max Planck Institute of Animal Behaviour, Konstanz, Germany; Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Konstanz, Germany; Department of Biology, University of Konstanz, Konstanz, Germany
Benjamin Risse
Faculty of Mathematics & Computer Science, University of Münster, Germany
Computer Vision, Machine Learning, Ecology, Additive Manufacturing, Biomedical Image Processing