XDen-1K: A Density Field Dataset of Real-World Objects

📅 2025-12-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing 3D models typically neglect internal physical properties—such as volumetric density—hindering centroid estimation, stability analysis, and robotic manipulation. To address this, we introduce XDen-1K, the first large-scale, real-world volumetric density field dataset, comprising 1,000 high-fidelity 3D models spanning 148 object categories, each paired with dual-plane X-ray scans. We present the first method to reconstruct high-fidelity volumetric density fields from sparse X-ray projections. Our approach integrates multi-modal acquisition (X-ray imaging, geometry, and part annotations), optimization-based density reconstruction, and an X-ray-conditioned segmentation network enabling density-driven, physics-aware part segmentation. We further propose a novel physics-informed robotic manipulation paradigm grounded in estimated density fields. Validated through both physics simulation and real-robot experiments, our method reduces centroid estimation error by 42% and improves grasp success rate by 27%. XDen-1K establishes a generalizable, physically grounded representation benchmark for embodied AI.

Technology Category

Application Category

📝 Abstract
A deep understanding of the physical world is a central goal for embodied AI and realistic simulation. While current models excel at capturing an object's surface geometry and appearance, they largely neglect its internal physical properties. This omission is critical, as properties like volumetric density are fundamental for predicting an object's center of mass, stability, and interaction dynamics in applications ranging from robotic manipulation to physical simulation. The primary bottleneck has been the absence of large-scale, real-world data. To bridge this gap, we introduce XDen-1K, the first large-scale, multi-modal dataset designed for real-world physical property estimation, with a particular focus on volumetric density. The core of this dataset consists of 1,000 real-world objects across 148 categories, for which we provide comprehensive multi-modal data, including a high-resolution 3D geometric model with part-level annotations and a corresponding set of real-world biplanar X-ray scans. Building upon this data, we introduce a novel optimization framework that recovers a high-fidelity volumetric density field of each object from its sparse X-ray views. To demonstrate its practical value, we add X-ray images as a conditioning signal to an existing segmentation network and perform volumetric segmentation. Furthermore, we conduct experiments on downstream robotics tasks. The results show that leveraging the dataset can effectively improve the accuracy of center-of-mass estimation and the success rate of robotic manipulation. We believe XDen-1K will serve as a foundational resource and a challenging new benchmark, catalyzing future research in physically grounded visual inference and embodied AI.
Problem

Research questions and friction points this paper is trying to address.

Estimates volumetric density of real-world objects from X-ray scans
Addresses lack of large-scale data for physical property estimation
Improves center-of-mass prediction and robotic manipulation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dataset with multi-modal data for density estimation
Optimization framework recovers density fields from X-rays
X-ray conditioning improves segmentation and robotics tasks
🔎 Similar Papers
No similar papers found.
J
Jingxuan Zhang
ShanghaiTech University
T
Tianqi Yu
ShanghaiTech University
Y
Yatu Zhang
ShanghaiTech University
J
Jinze Wu
ShanghaiTech University
K
Kaixin Yao
ShanghaiTech University
J
Jingyang Liu
ShanghaiTech University
Yuyao Zhang
Yuyao Zhang
Renmin University of China
Artificial Intelligence
Jiayuan Gu
Jiayuan Gu
Assistant Professor, ShanghaiTech University
Embodied AI3D Vision
Jingyi Yu
Jingyi Yu
Professor, ShanghaiTech University
Computer VisionComputer Graphics