Kalib: Easy Hand-Eye Calibration with Reference Point Tracking

📅 2024-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conventional hand-eye calibration relies on artificial markers, while deep learning approaches require task-specific retraining or precise 3D CAD models, limiting adaptability in unstructured environments. Method: We propose a marker-free, training-free, and mesh-free calibration framework that requires only the robot's kinematic model and a single predefined reference point. Leveraging vision foundation models for monocular tracking of that point, the method combines forward kinematics with EPnP to estimate the hand-eye transformation. Contribution/Results: This is the first work to harness the zero-shot generalization of vision foundation models for hand-eye calibration, eliminating dependence on custom networks, calibration boards, or CAD models and enabling continuous operation in unstructured settings. Extensive simulated and real-world experiments demonstrate accuracy competitive with state-of-the-art methods while reducing human intervention by over 50%. Deployments across multiple robot arms and grippers validate its robustness and plug-and-play usability.
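The summary compresses the geometry into one sentence; the sketch below spells out the PnP step under stated assumptions. It is not the authors' released code: the function names, the synthetic inputs, and the use of OpenCV's solvePnP are illustrative. Given N pixel observations of the tracked reference point and the matching 3D positions of that point in the robot base frame (from forward kinematics), a single PnP solve yields the transform from the robot base frame to the camera frame.

```python
# Hypothetical sketch of the PnP step (not the paper's implementation):
# pair N tracked pixel positions of the reference point with the N base-frame
# 3D positions reported by forward kinematics, then solve for the camera pose.
import numpy as np
import cv2


def estimate_base_to_camera(points_3d_base, points_2d_px, K, dist_coeffs=None):
    """Return the 4x4 transform from the robot base frame to the camera frame.

    points_3d_base : (N, 3) reference-point positions in the robot base frame
    points_2d_px   : (N, 2) tracked pixel coordinates of the same point
    K              : (3, 3) camera intrinsic matrix
    dist_coeffs    : lens distortion coefficients (zeros if None)
    """
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(points_3d_base, dtype=np.float64),
        np.asarray(points_2d_px, dtype=np.float64),
        np.asarray(K, dtype=np.float64),
        np.asarray(dist_coeffs, dtype=np.float64),
        flags=cv2.SOLVEPNP_EPNP,  # EPnP, as named in the summary above
    )
    if not ok:
        raise RuntimeError("PnP failed; gather more widely spread robot poses")
    R, _ = cv2.Rodrigues(rvec)          # rotation: base frame -> camera frame
    T_cam_base = np.eye(4)
    T_cam_base[:3, :3] = R
    T_cam_base[:3, 3] = tvec.ravel()
    return T_cam_base
```

Inverting the result gives the camera pose in the robot base frame (the usual eye-to-hand output); for an eye-in-hand camera the 3D correspondences would instead be expressed in the end-effector frame.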

📝 Abstract
Hand-eye calibration aims to estimate the transformation between a camera and a robot. Traditional methods rely on fiducial markers, which require considerable manual effort and precise setup. Recent advances in deep learning have introduced markerless techniques but come with more prerequisites, such as retraining networks for each robot and accessing accurate mesh models for data generation. In this paper, we propose Kalib, an automatic and easy-to-setup hand-eye calibration method that leverages the generalizability of visual foundation models to overcome these challenges. It features only two basic prerequisites: the robot's kinematic chain and a predefined reference point on the robot. During calibration, the reference point is tracked in camera space. Its corresponding 3D coordinates in the robot coordinate system can be inferred by forward kinematics. Then, a PnP solver directly estimates the transformation between the camera and the robot without training new networks or accessing mesh models. Evaluations in simulated and real-world benchmarks show that Kalib achieves good accuracy with a lower manual workload compared with recent baseline methods. We also demonstrate its application in multiple real-world settings with various robot arms and grippers. Kalib's user-friendly design and minimal setup requirements make it a possible solution for continuous operation in unstructured environments.
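The abstract's step "inferred by forward kinematics" amounts to chaining the joint transforms of the kinematic model and applying the reference point's fixed offset in the last link's frame. The sketch below uses a deliberately simplified, hypothetical chain (every joint revolute about its local z axis); a real robot would use its URDF, so this only illustrates where the 3D side of each 2D-3D correspondence comes from.

```python
# Minimal forward-kinematics sketch (simplified, hypothetical chain; a real
# robot would use its URDF): compute the reference point's position in the
# robot base frame for one set of joint angles.
import numpy as np


def rot_z(theta):
    """Homogeneous rotation about the local z axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T


def translate(offset):
    """Homogeneous translation by a 3-vector."""
    T = np.eye(4)
    T[:3, 3] = offset
    return T


def reference_point_in_base(joint_angles, link_offsets, p_ref_local):
    """Position of the reference point in the robot base frame.

    joint_angles : joint values in radians, one per revolute joint
    link_offsets : per-link (3,) translations from one joint to the next
    p_ref_local  : (3,) reference point expressed in the last link's frame
    """
    T = np.eye(4)
    for theta, offset in zip(joint_angles, link_offsets):
        T = T @ rot_z(theta) @ translate(offset)  # rotate at joint, move along link
    p_h = T @ np.append(p_ref_local, 1.0)         # homogeneous coordinates
    return p_h[:3]
```

Repeating this over many robot configurations yields the base-frame 3D points that pair with the tracked pixels in the PnP solve.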
Problem

Research questions and friction points this paper is trying to address.

Estimating the camera-robot transformation without fiducial markers
Reducing the manual effort and setup complexity of calibration
Avoiding the need to retrain networks or obtain accurate mesh models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses visual foundation models for zero-shot generalization
Tracks a predefined reference point in the camera image for automatic calibration (a tracking sketch follows this list)
Requires only the robot's kinematic chain and a reference point on the robot
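The page does not name the tracker's interface, so the sketch below substitutes OpenCV's pyramidal Lucas-Kanade optical flow for the foundation-model tracker; it only shows how the reference point's pixel trajectory (the 2D side of the correspondences) might be gathered frame by frame.

```python
# Hypothetical tracking loop: Lucas-Kanade optical flow stands in for the
# foundation-model tracker and yields one pixel observation per frame.
import numpy as np
import cv2


def track_reference_point(frames, initial_px):
    """Track a single point through a sequence of grayscale frames.

    frames     : list of HxW uint8 grayscale images
    initial_px : (x, y) pixel location of the reference point in frames[0]
    """
    pt = np.array([[initial_px]], dtype=np.float32)      # shape (1, 1, 2)
    observations = [np.asarray(initial_px, dtype=np.float32)]
    prev = frames[0]
    for frame in frames[1:]:
        nxt, status, _err = cv2.calcOpticalFlowPyrLK(
            prev, frame, pt, None, winSize=(21, 21), maxLevel=3
        )
        if status[0, 0] != 1:
            raise RuntimeError("Lost track of the reference point")
        pt, prev = nxt, frame
        observations.append(nxt[0, 0].copy())
    return np.stack(observations)                        # (num_frames, 2)
```

A learned tracker would simply replace this function; downstream, each tracked pixel is paired with the forward-kinematics point for the same robot pose.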
Tutian Tang
Shanghai Jiao Tong University
Robotics
Minghao Liu
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
Wenqiang Xu
Shanghai Jiao Tong University
Computer vision, Robotics
Cewu Lu
Qing Yuan Research Institute and MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China