🤖 AI Summary
This work addresses fine-grained material attribute recognition from monocular indoor images, with the goal of improving consumer robots' perception of objects' physical properties. To this end, we introduce MatPredict, the first synthetic benchmark tailored to indoor environments. It combines Replica's high-fidelity 3D geometry with MatSynth's fine-grained material semantics and is rendered with Blender's physically-based renderer, producing multi-view, multi-illumination images across 18 object categories and 14 material classes while keeping geometry and material physically consistent. We propose a unified evaluation framework covering material classification and cross-material retrieval, and establish CNN- and ViT-based baselines. The dataset, code, and evaluation toolkit are publicly released, providing a reproducible and extensible foundation for vision-driven material understanding.
📝 Abstract
Determining material properties from camera images can improve the identification of complex objects in indoor environments, which is valuable for consumer robotics applications. To support this, we introduce MatPredict, a dataset that combines the high-quality synthetic objects from the Replica dataset with the material property classes of the MatSynth dataset to create objects with diverse material properties. We select 3D meshes of specific foreground objects and render them with different material properties. In total, we generate 18 commonly occurring objects with 14 different materials. We show how we vary lighting and camera placement for these objects. Next, we provide a benchmark for inferring material properties from images of these perturbed models in the scene, discussing the specific neural network models involved and their performance under different image comparison metrics. Accurately simulating light interactions with different materials enhances realism, which is crucial for training models effectively through large-scale simulation. This research aims to revolutionize perception in consumer robotics. The dataset is available at https://huggingface.co/datasets/UMTRI/MatPredict and the code at https://github.com/arpan-kusari/MatPredict.
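To make the scale of the rendering grid concrete, the following is a minimal Python sketch of how object, material, lighting, and camera settings might be enumerated. Only the counts of 18 objects and 14 materials come from the abstract. The placeholder names, the two lighting and three camera settings, the `render_configs` helper, and the assumption that every object is paired with every material are hypothetical and are not taken from the released code.

```python
from itertools import product

# Counts stated in the abstract.
NUM_OBJECTS = 18
NUM_MATERIALS = 14

# Hypothetical placeholder names; the real object and material names are not listed here.
objects = [f"object_{i:02d}" for i in range(NUM_OBJECTS)]
materials = [f"material_{j:02d}" for j in range(NUM_MATERIALS)]

# Assumed perturbation settings, for illustration only.
lightings = ["light_a", "light_b"]
cameras = ["cam_front", "cam_side", "cam_top"]


def render_configs(objects, materials, lightings, cameras):
    """Yield one dict per (object, material, lighting, camera) combination."""
    for obj, mat, light, cam in product(objects, materials, lightings, cameras):
        yield {"object": obj, "material": mat, "lighting": light, "camera": cam}


configs = list(render_configs(objects, materials, lightings, cameras))

# Distinct object-material pairs, ignoring lighting and camera.
pairs = {(c["object"], c["material"]) for c in configs}

print(len(pairs), "object-material pairs,", len(configs), "render configurations")
```

Under these assumptions, the full cross product gives 18 × 14 object-material pairs, each repeated once per lighting and camera setting. The actual dataset may pair objects and materials more selectively.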