Rethinking Camera Choice: An Empirical Study on Fisheye Camera Properties in Robotic Manipulation

📅 2026-03-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of systematic understanding of how fisheye cameras affect robot imitation learning downstream, specifically in spatial localization, scene generalization, and hardware generalization. Through both simulation and real-world experiments, it evaluates how the wide field of view (FoV) of wrist-mounted fisheye cameras influences policy learning; the authors describe it as the first comprehensive empirical study of this kind. The results show that the wide FoV significantly enhances spatial localization, although the benefit depends on the visual complexity of the environment. Fisheye-trained policies tend to overfit in simple scenes but generalize better across scenes when trained with sufficient environmental diversity. Naive cross-camera transfer fails because of scale overfitting, and a simple Random Scale Augmentation (RSA) strategy improves hardware generalization across camera systems. Together, these findings give empirical evidence and practical guidance for collecting and using fisheye datasets in robotic manipulation.

📝 Abstract

The adoption of fisheye cameras in robotic manipulation, driven by their exceptionally wide Field of View (FoV), is rapidly outpacing a systematic understanding of their downstream effects on policy learning. This paper presents the first comprehensive empirical study to bridge this gap, rigorously analyzing the properties of wrist-mounted fisheye cameras for imitation learning. Through extensive experiments in both simulation and the real world, we investigate three critical research questions: spatial localization, scene generalization, and hardware generalization. Our investigation reveals that: (1) The wide FoV significantly enhances spatial localization, but this benefit is critically contingent on the visual complexity of the environment. (2) Fisheye-trained policies, while prone to overfitting in simple scenes, unlock superior scene generalization when trained with sufficient environmental diversity. (3) While naive cross-camera transfer leads to failures, we identify the root cause as scale overfitting and demonstrate that hardware generalization performance can be improved with a simple Random Scale Augmentation (RSA) strategy. Collectively, our findings provide concrete, actionable guidance for the large-scale collection and effective use of fisheye datasets in robotic learning. More results and videos are available on https://robo-fisheye.github.io/
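
The abstract names Random Scale Augmentation (RSA) but does not describe the procedure. As a rough illustration only, the sketch below shows one plausible reading: each training image is rescaled about its center by a randomly drawn factor while the output resolution stays fixed, so a policy sees objects at varying apparent sizes. The function name `random_scale_augment`, the scale range, nearest-neighbor sampling, and zero padding are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def random_scale_augment(image, rng, scale_range=(0.7, 1.3)):
    """Rescale `image` (H, W) or (H, W, C) about its center by a random factor.

    Hypothetical sketch: the scale range, nearest-neighbor sampling, and
    zero padding are illustrative assumptions, not the paper's settings.
    Returns the augmented image (same shape as the input) and the factor used.
    """
    h, w = image.shape[:2]
    s = rng.uniform(*scale_range)

    # Inverse mapping: for every output pixel, locate the source pixel it samples.
    cy, cx = (h - 1) / 2, (w - 1) / 2
    src_y = np.rint((np.arange(h) - cy) / s + cy).astype(int)
    src_x = np.rint((np.arange(w) - cx) / s + cx).astype(int)

    # Output pixels whose source falls outside the image stay zero (s < 1 case).
    valid = ((src_y >= 0) & (src_y < h))[:, None] & ((src_x >= 0) & (src_x < w))[None, :]

    # Gather source pixels; clipping keeps indices legal, `valid` masks them afterwards.
    gathered = image[np.clip(src_y, 0, h - 1)][:, np.clip(src_x, 0, w - 1)]

    out = np.zeros_like(image)
    out[valid] = gathered[valid]
    return out, s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8)
    augmented, factor = random_scale_augment(frame, rng)
    print(augmented.shape, factor)
```

In a training pipeline, a function like this would typically be applied independently to each sampled frame before it is passed to the policy network. A factor above 1 zooms in and crops the periphery, while a factor below 1 shrinks the content and pads the border.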

Problem

Research questions and friction points this paper is trying to address.

fisheye camera

robotic manipulation

spatial localization

scene generalization

hardware generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

fisheye camera

robotic manipulation

imitation learning