PIRATR: Parametric Object Inference for Robotic Applications with Transformers in 3D Point Clouds

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses robotic manipulation in heavily occluded 3D point clouds with an end-to-end Transformer framework that jointly predicts 6-degree-of-freedom poses and task-relevant parameters, such as gripper aperture, for multiple object categories. The approach introduces a pose-aware, parametrized perception paradigm, using modular category-specific prediction heads to unify geometric localization with the estimation of manipulable attributes. This design enables seamless extension to novel object types without architectural redesign. Trained exclusively on synthetic data, the method achieves a mean Average Precision (mAP) of 0.919 on real-world outdoor LiDAR scans and has been successfully deployed on an autonomous forklift platform, demonstrating strong generalization and practical utility in real-world conditions.

📝 Abstract
We present PIRATR, an end-to-end 3D object detection framework for robotic use cases in point clouds. Extending PI3DETR, our method streamlines parametric 3D object detection by jointly estimating multi-class 6-DoF poses and class-specific parametric attributes directly from occlusion-affected point cloud data. This formulation enables not only geometric localization but also the estimation of task-relevant properties for parametric objects, such as a gripper's opening, where the 3D model is adjusted according to simple, predefined rules. The architecture employs modular, class-specific heads, making it straightforward to extend to novel object types without re-designing the pipeline. We validate PIRATR on an automated forklift platform, focusing on three structurally and functionally diverse categories: crane grippers, loading platforms, and pallets. Trained entirely in a synthetic environment, PIRATR generalizes effectively to real outdoor LiDAR scans, achieving a detection mAP of 0.919 without additional fine-tuning. PIRATR establishes a new paradigm of pose-aware, parameterized perception. This bridges the gap between low-level geometric reasoning and actionable world models, paving the way for scalable, simulation-trained perception systems that can be deployed in dynamic robotic environments. Code available at https://github.com/swingaxe/piratr.
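The abstract describes modular, class-specific heads that attach category-dependent parametric attributes (such as a gripper's opening) to a shared 6-DoF pose prediction, so that new object types only need a new head rather than a pipeline redesign. A minimal sketch of that registry-of-heads idea is shown below; the class names, parameter sets, and function names are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch of PIRATR-style modular, class-specific prediction heads.
# Categories and their parametric attributes are illustrative placeholders.
from dataclasses import dataclass, field

@dataclass
class Detection:
    category: str
    pose: tuple                                  # 6-DoF pose: (x, y, z, roll, pitch, yaw)
    params: dict = field(default_factory=dict)   # class-specific parametric attributes

# Registry of the parametric attributes each category's head predicts.
# Extending to a novel object type is just a new entry here.
HEAD_PARAMS = {
    "crane_gripper": ["opening_width"],     # aperture adjusts the 3D model by a fixed rule
    "loading_platform": ["deck_height"],
    "pallet": [],                           # pallets are localized by pose alone
}

def make_detection(category, pose, raw_params):
    """Keep only the attributes registered for this category's head."""
    allowed = HEAD_PARAMS[category]
    params = {k: raw_params[k] for k in allowed if k in raw_params}
    return Detection(category, pose, params)

det = make_detection("crane_gripper",
                     pose=(1.0, 2.0, 0.5, 0.0, 0.0, 0.3),
                     raw_params={"opening_width": 0.42, "ignored": 1.0})
print(det.params)  # {'opening_width': 0.42}
```

The point of the registry is the extension property claimed in the abstract: geometric localization stays uniform across classes, while each head contributes only the attributes its object type actually parametrizes.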
Problem

Research questions and friction points this paper is trying to address.

3D object detection
parametric objects
6-DoF pose estimation
point clouds
robotic perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

parametric object detection
6-DoF pose estimation
point cloud perception
simulation-to-reality transfer
modular transformer architecture
Michael Schwingshackl
AIT Austrian Institute of Technology
Machine Learning · Computer Vision · Synthetic Data Generation

Fabio F. Oberweger
AIT Austrian Institute of Technology, Center for Vision, Automation & Control

Mario Niedermeyer
AIT Austrian Institute of Technology, Center for Vision, Automation & Control

Johannes Huemer
AIT Austrian Institute of Technology, Center for Vision, Automation & Control

Markus Murschitz
AIT Austrian Institute of Technology
Computer Vision · Natural Language Processing · Domain Modelling