Foundry: Distilling 3D Foundation Models for the Edge

📅 2025-11-25
🤖 AI Summary
Deploying large 3D foundation models on resource-constrained edge devices remains challenging, and existing compression methods often degrade their cross-task generalization. Method: This paper proposes Foundation Model Distillation (FMD), the first such framework tailored to 3D point clouds. It employs token-level knowledge distillation to guide a lightweight student in learning a compact basis of the teacher's self-supervised representation space, generating reconstructive "SuperTokens" that faithfully recover teacher features. Contribution/Results: FMD reduces token count by more than 80% and substantially cuts computational overhead while preserving the original model's general-purpose representations across downstream tasks, including classification, segmentation, and few-shot transfer. Experiments show that a single distilled student achieves near-teacher performance and is deployable on edge devices such as resource-limited robots and AR/VR systems, establishing the first efficient, general-purpose distillation solution for edge-deployable 3D foundation models.

📝 Abstract
Foundation models pre-trained with self-supervised learning (SSL) on large-scale datasets have become powerful general-purpose feature extractors. However, their immense size and computational cost make them prohibitive for deployment on edge devices such as robots and AR/VR headsets. Existing compression techniques like standard knowledge distillation create efficient 'specialist' models but sacrifice the crucial, downstream-agnostic generality that makes foundation models so valuable. In this paper, we introduce Foundation Model Distillation (FMD), a new paradigm for compressing large SSL models into compact, efficient, and faithful proxies that retain their general-purpose representational power. We present Foundry, the first implementation of FMD for 3D point clouds. Foundry trains a student to learn a compressed set of SuperTokens that reconstruct the teacher's token-level representations, capturing a compact basis of its latent space. A single distilled model maintains strong transferability across diverse downstream tasks (classification, part segmentation, and few-shot scenarios), approaching full foundation-model performance while using significantly fewer tokens and FLOPs, making such models more practical for deployment on resource-constrained hardware.
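The core idea of SuperToken reconstruction can be sketched numerically. The following is a minimal, hypothetical illustration (not the paper's implementation): a small set of M SuperTokens is used to reconstruct N teacher tokens via attention weights, and the token-level distillation objective is the mean-squared reconstruction error. The shapes, the scaled dot-product attention map, and the variable names are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical shapes: the teacher emits N tokens; the student keeps M << N SuperTokens.
N, M, d = 256, 32, 64                           # token counts and feature dim (illustrative)
teacher_tokens = rng.standard_normal((N, d))    # stand-in for frozen teacher features
super_tokens = rng.standard_normal((M, d))      # stand-in for learnable student SuperTokens

# Reconstruct each teacher token as an attention-weighted mix of SuperTokens.
attn = softmax(teacher_tokens @ super_tokens.T / np.sqrt(d), axis=-1)  # (N, M)
reconstruction = attn @ super_tokens                                   # (N, d)

# Token-level distillation objective: mean-squared reconstruction error
# against the (detached) teacher tokens.
distill_loss = float(np.mean((reconstruction - teacher_tokens) ** 2))
```

In a real training loop the SuperTokens (and the mapping that produces them from the input point cloud) would be optimized by gradient descent on `distill_loss`; here the forward pass alone shows how M tokens can serve as a compact basis for N teacher features.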
Problem

Research questions and friction points this paper is trying to address.

Compressing large 3D foundation models for edge device deployment
Preserving general-purpose representational power during model distillation
Maintaining performance across diverse tasks with reduced computational costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distilling 3D foundation models for edge devices
Compressing models via SuperTokens reconstruction
Maintaining generalizability across diverse downstream tasks
Guillaume Letellier
GREYC, Normandy University, Unicaen, ENSICAEN, UMR CNRS 6072, F-14000 Caen, France
Siddharth Srivastava
Arizona State University
Artificial Intelligence · Automated Planning · Robotics · Task and Motion Planning · AI Assessment
Frédéric Jurie
GREYC, Normandy University, Unicaen, ENSICAEN, UMR CNRS 6072, F-14000 Caen, France
Gaurav Sharma
IIT Kanpur