Systematic Comparison of Projection Methods for Monocular 3D Human Pose Estimation on Fisheye Images

📅 2025-06-24
📈 Citations: 0 (influential: 0)
🤖 AI Summary
To address the low accuracy of monocular fisheye-based 3D human pose estimation caused by severe radial distortion, this paper systematically evaluates four projection models (pinhole, equidistant, double sphere, and cylindrical) in terms of undistortion quality and 3D pose reconstruction performance. It proposes a geometry-aware heuristic that selects the projection model adaptively from the human detection bounding box, eliminating manual preselection. It also introduces FISHnCHIPS, the first real-world fisheye dataset featuring extreme viewpoints and precise 3D annotations. Experiments show that the double sphere model significantly improves absolute pose accuracy, especially in close-range and wide-field-of-view scenarios, reducing mean joint error by 12.7% relative to the pinhole model. These results underscore the importance of adapting the projection model for wide-FOV monocular 3D pose estimation.

📝 Abstract
Fisheye cameras offer robots the ability to capture human movements across a wider field of view (FOV) than standard pinhole cameras, making them particularly useful for applications in human-robot interaction and automotive contexts. However, accurately detecting human poses in fisheye images is challenging due to the curved distortions inherent to fisheye optics. While various methods for undistorting fisheye images have been proposed, their effectiveness and limitations for poses that cover a wide FOV have not been systematically evaluated in the context of absolute human pose estimation from monocular fisheye images. To address this gap, we evaluate the impact of pinhole, equidistant and double sphere camera models, as well as cylindrical projection methods, on 3D human pose estimation accuracy. We find that in close-up scenarios, pinhole projection is inadequate, and the optimal projection method varies with the FOV covered by the human pose. Using advanced fisheye models such as the double sphere model significantly enhances 3D human pose estimation accuracy. We propose a heuristic for selecting the appropriate projection model based on the detection bounding box to enhance prediction quality. Additionally, we introduce and evaluate on our novel dataset FISHnCHIPS, which features 3D human skeleton annotations in fisheye images, including images from unconventional angles, such as extreme close-ups, ground-mounted cameras, and wide-FOV poses, available at: https://www.vision.rwth-aachen.de/fishnchips
Problem

Research questions and friction points this paper is trying to address.

Evaluating projection methods for 3D human pose estimation on fisheye images
Assessing distortion impact on wide-FOV poses in monocular fisheye images
Proposing heuristic for optimal projection model selection based on bounding box
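To make the idea of "FOV covered by the human pose" concrete, the sketch below estimates the angular extent of a detection bounding box and picks a projection from it. This is not the paper's heuristic, which is not specified here. It assumes an ideal equidistant fisheye (r = f·θ, no distortion coefficients), and the function names, the single 60° threshold, and the returned labels are all hypothetical placeholders.

```python
import math
from itertools import combinations


def unproject_equidistant(u, v, fx, fy, cx, cy):
    """Map a pixel to a unit viewing ray, assuming the ideal model r = f * theta."""
    mx, my = (u - cx) / fx, (v - cy) / fy
    theta = math.hypot(mx, my)          # angle from the optical axis
    if theta < 1e-12:
        return (0.0, 0.0, 1.0)          # pixel at the principal point
    s = math.sin(theta) / theta
    return (s * mx, s * my, math.cos(theta))


def bbox_angular_extent(bbox, fx, fy, cx, cy):
    """Largest angle (radians) between rays through box corners and edge midpoints."""
    x0, y0, x1, y1 = bbox
    xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    pixels = [(x0, y0), (x1, y0), (x0, y1), (x1, y1),
              (xm, y0), (xm, y1), (x0, ym), (x1, ym)]
    rays = [unproject_equidistant(u, v, fx, fy, cx, cy) for u, v in pixels]
    best = 0.0
    for a, b in combinations(rays, 2):
        dot = sum(p * q for p, q in zip(a, b))
        dot = max(-1.0, min(1.0, dot))  # guard acos against rounding
        best = max(best, math.acos(dot))
    return best


def choose_projection(extent_rad, pinhole_max_deg=60.0):
    """Hypothetical rule: pinhole for narrow boxes, a fisheye model otherwise.

    The 60 degree default is a placeholder, not a value from the paper.
    """
    if math.degrees(extent_rad) < pinhole_max_deg:
        return "pinhole"
    return "double_sphere"
```

A caller would compute `bbox_angular_extent` for each detection and pass the result to `choose_projection`; in practice the threshold would have to be tuned on validation data for the camera in question.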
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates pinhole, equidistant, double sphere models
Proposes heuristic for selecting projection model
Introduces FISHnCHIPS dataset with fisheye annotations
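The projection models named above can be compared side by side with a minimal sketch of their forward projections (3D camera-frame point to pixel). The formulas are the standard textbook forms; the paper's exact parameterizations (for example, additional distortion coefficients) may differ, and the function names are illustrative only.

```python
import math


def project_pinhole(p, fx, fy, cx, cy):
    """Perspective projection; only defined for points in front of the camera."""
    x, y, z = p
    if z <= 0.0:
        raise ValueError("pinhole projection requires z > 0")
    return (fx * x / z + cx, fy * y / z + cy)


def project_equidistant(p, fx, fy, cx, cy):
    """Ideal equidistant fisheye: image radius is proportional to the ray angle."""
    x, y, z = p
    rho = math.hypot(x, y)
    if rho < 1e-12:
        return (cx, cy)                 # point on the optical axis
    theta = math.atan2(rho, z)          # angle from the optical axis
    return (fx * theta * x / rho + cx, fy * theta * y / rho + cy)


def project_double_sphere(p, fx, fy, cx, cy, xi, alpha):
    """Double sphere model with sphere offset xi and blending parameter alpha."""
    x, y, z = p
    d1 = math.sqrt(x * x + y * y + z * z)
    d2 = math.sqrt(x * x + y * y + (xi * d1 + z) ** 2)
    denom = alpha * d2 + (1.0 - alpha) * (xi * d1 + z)
    if denom <= 0.0:
        raise ValueError("point is outside the valid projection region")
    return (fx * x / denom + cx, fy * y / denom + cy)


def project_cylindrical(p, fx, fy, cx, cy):
    """Cylindrical projection: azimuth angle horizontally, perspective vertically."""
    x, y, z = p
    rho = math.hypot(x, z)
    if rho < 1e-12:
        raise ValueError("point lies on the cylinder axis")
    return (fx * math.atan2(x, z) + cx, fy * y / rho + cy)
```

With `xi = 0` and `alpha = 0` the double sphere function reduces to the pinhole one, which is a convenient sanity check. A point behind the image plane (negative `z`) is rejected by `project_pinhole` but still maps to a finite pixel under `project_equidistant`, which illustrates why perspective projection cannot represent very wide fields of view.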
Stephanie Käs
Chair for Computer Vision, RWTH Aachen University, Germany
Sven Peter
Chair for Computer Vision, RWTH Aachen University, Germany
Henrik Thillmann
Chair for Computer Vision, RWTH Aachen University, Germany
Anton Burenko
Chair for Computer Vision, RWTH Aachen University, Germany
David Benjamin Adrian
Robert Bosch GmbH, Corporate Research & Bosch Center for AI, Renningen and Hildesheim, Germany
Dennis Mack
Robert Bosch GmbH, Corporate Research & Bosch Center for AI, Renningen and Hildesheim, Germany
Timm Linder
Research Scientist 3D Robot Perception, Bosch Research
Computer Vision, 3D Scene Understanding, HRI, Robotics, Autonomous Systems
Bastian Leibe
Professor for Computer Vision, RWTH Aachen University
Computer Vision, Object Recognition, Tracking, Scene Understanding