🤖 AI Summary
This work addresses general 3D rotation estimation from RGB images, requiring no category-specific training and supporting zero-shot transfer. Method: We propose a lightweight Transformer-based framework that, for the first time, performs joint latent-space modeling across multiple reference images. Our approach integrates multi-reference feature fusion, rotation-aware latent representation learning, and end-to-end differentiable regression to predict the 3D rotation of a query image from several reference images with known poses in a single forward pass. Contribution/Results: The method achieves state-of-the-art accuracy on multiple benchmarks while cutting inference latency by over 40%, which significantly improves deployability on edge devices. It delivers both strong generalization to unseen categories and low-latency inference, offering a favorable trade-off between robustness and efficiency.
📝 Abstract
We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method directly predicts the object's rotation in a single forward pass, without requiring object- or category-specific training. At the core of our framework is a transformer that performs the comparison in latent space, jointly processing rotation-aware representations from multiple references alongside the query. This design enables a favorable balance between accuracy and computational efficiency while remaining simple, scalable, and fully end-to-end. Experimental results show that Eff-GRot offers a promising direction toward more efficient rotation estimation, particularly in latency-sensitive applications.
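The latent-space comparison described above can be sketched in a heavily simplified form. The sketch below is an illustrative assumption, not the paper's implementation: the toy single-head attention that fuses reference latents with a query latent, the random (untrained) regression head, and the axis-angle output parameterization are all stand-ins for the actual encoder, transformer, and rotation head.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query_tok, ref_toks):
    # Toy single-head scaled dot-product attention: the query latent
    # attends over the reference latents and returns a fused feature.
    d = len(query_tok)
    scores = [dot(r, query_tok) / math.sqrt(d) for r in ref_toks]
    w = softmax(scores)
    return [sum(w[i] * ref_toks[i][j] for i in range(len(ref_toks)))
            for j in range(d)]

def axis_angle_to_matrix(v):
    # Rodrigues' formula: rotation about axis v/|v| by angle |v|.
    theta = math.sqrt(dot(v, v))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in v)
    c, s = math.cos(theta), math.sin(theta)
    C = 1.0 - c
    return [
        [c + kx * kx * C,      kx * ky * C - kz * s, kx * kz * C + ky * s],
        [ky * kx * C + kz * s, c + ky * ky * C,      ky * kz * C - kx * s],
        [kz * kx * C - ky * s, kz * ky * C + kx * s, c + kz * kz * C],
    ]

# Toy latents for one query and 4 references; in the real model these
# would come from an image encoder, with known poses attached to the refs.
random.seed(0)
d = 8
query_tok = [random.gauss(0, 1) for _ in range(d)]
ref_toks = [[random.gauss(0, 1) for _ in range(d)] for _ in range(4)]
fused = attend(query_tok, ref_toks)

# Untrained linear head: fused feature -> axis-angle -> rotation matrix,
# giving the single-forward-pass regression shape described in the text.
head = [[random.gauss(0, 0.1) for _ in range(3)] for _ in range(d)]
pred_axis_angle = [sum(fused[i] * head[i][j] for i in range(d))
                   for j in range(3)]
R = axis_angle_to_matrix(pred_axis_angle)
```

The axis-angle parameterization is chosen here only because it guarantees the head's output maps to a valid element of SO(3); the actual output representation used by Eff-GRot is not specified in this abstract.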