🤖 AI Summary
Deploying large 3D foundation models on resource-constrained edge devices remains challenging, and existing compression methods often degrade their cross-task generalization. Method: This paper proposes Foundation Model Distillation (FMD), a paradigm for compressing self-supervised foundation models into compact, faithful proxies, and presents Foundry, the first FMD framework for 3D point clouds. It employs token-level knowledge distillation to guide a lightweight student model in learning a compact basis of the teacher's self-supervised representation space and generating reconstructive "SuperTokens" that faithfully recover the teacher's token features. Contribution/Results: FMD reduces token count by over 80% and significantly lowers computational overhead while preserving the teacher's general-purpose representation capability across downstream tasks, including classification, part segmentation, and few-shot transfer. Experiments show that a single distilled student model approaches teacher performance and is deployable on edge devices such as resource-limited robots and AR/VR systems, establishing the first efficient, general-purpose distillation solution for edge-deployable 3D foundation models.
📝 Abstract
Foundation models pre-trained with self-supervised learning (SSL) on large-scale datasets have become powerful general-purpose feature extractors. However, their immense size and computational cost make them prohibitive for deployment on edge devices such as robots and AR/VR headsets. Existing compression techniques like standard knowledge distillation create efficient 'specialist' models but sacrifice the crucial, downstream-agnostic generality that makes foundation models so valuable. In this paper, we introduce Foundation Model Distillation (FMD), a new paradigm for compressing large SSL models into compact, efficient, and faithful proxies that retain their general-purpose representational power. We present Foundry, the first implementation of FMD for 3D point clouds. Foundry trains a student to learn a compressed set of SuperTokens that reconstruct the teacher's token-level representations, capturing a compact basis of its latent space. A single distilled model maintains strong transferability across diverse downstream tasks (classification, part segmentation, and few-shot scenarios), approaching full foundation-model performance while using significantly fewer tokens and FLOPs, making such models more practical for deployment on resource-constrained hardware.
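The core idea, a student that emits a small set of SuperTokens whose recombinations reconstruct the teacher's token features, can be sketched numerically. This is a minimal illustration, not the paper's implementation: the names `T`, `S`, and `W`, the softmax mixing matrix, and the plain MSE objective are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, d = 64, 8, 32  # teacher tokens, SuperTokens (K << N), feature dim

# Frozen teacher token features (stand-in for an SSL backbone's output).
T = rng.standard_normal((N, d))

# Hypothetical student outputs: K SuperTokens plus a per-token mixing
# matrix W (N x K) whose rows recombine SuperTokens to approximate
# each of the teacher's N token features.
S = rng.standard_normal((K, d))
logits = rng.standard_normal((N, K))
W = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # row-softmax

T_hat = W @ S                        # reconstructed teacher tokens (N x d)
loss = np.mean((T_hat - T) ** 2)     # token-level distillation loss to minimize

print(f"tokens kept: {K}/{N} ({1 - K / N:.0%} reduction), loss={loss:.3f}")
```

With K=8 SuperTokens standing in for N=64 teacher tokens, the sketch mirrors the claimed >80% token reduction; training would minimize `loss` over the student's parameters so the SuperTokens span a compact basis of the teacher's latent space.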