Mining Attribute Subspaces for Efficient Fine-tuning of 3D Foundation Models

📅 2026-04-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses efficient fine-tuning of 3D foundation models for downstream tasks when the data vary in texture, geometry, camera motion, and lighting. The authors generate synthetic datasets with controlled variations, fine-tune a LoRA adapter on each dataset, and extract a low-rank subspace for each attribute. They show that these subspaces are approximately disentangled, i.e., approximately orthogonal to one another. Integrating the attribute-specific subspaces yields a reduced LoRA subspace that, despite being derived entirely from synthetic data, generalizes to real datasets. Fine-tuning within this reduced subspace is efficient and improves prediction accuracy on downstream tasks.

📝 Abstract

With the emergence of 3D foundation models, there is growing interest in fine-tuning them for downstream tasks, where LoRA is the dominant fine-tuning paradigm. As 3D datasets exhibit distinct variations in texture, geometry, camera motion, and lighting, there are interesting fundamental questions: 1) Are there LoRA subspaces associated with each type of variation? 2) Are these subspaces disentangled (i.e., orthogonal to each other)? 3) How do we compute them effectively? This paper provides answers to all these questions. We introduce a robust approach that generates synthetic datasets with controlled variations, fine-tunes a LoRA adapter on each dataset, and extracts a LoRA subspace associated with each type of variation. We show that these subspaces are approximately disentangled. Integrating them leads to a reduced LoRA subspace that enables efficient LoRA fine-tuning with improved prediction accuracy for downstream tasks. In particular, we show that such a reduced LoRA subspace, despite being derived entirely from synthetic data, generalizes to real datasets. An ablation study validates the effectiveness of the choices in our approach.
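
The abstract does not spell out how the subspaces are extracted, compared, or integrated, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that each attribute's subspace is the span of the top left singular vectors of the LoRA update ΔW = BA, that disentanglement is measured through the principal angles between two such subspaces, and that integration means stacking the bases and keeping the leading directions. All function names, matrix shapes, and the random stand-in adapters are hypothetical.

```python
import numpy as np

def lora_subspace(B, A, k):
    """Orthonormal basis for the top-k column space of the LoRA update B @ A.

    Assumption: the attribute subspace is taken from the left singular
    vectors of the update; the paper may define it differently.
    """
    U, _, _ = np.linalg.svd(B @ A, full_matrices=False)
    return U[:, :k]

def subspace_overlap(U1, U2):
    """Mean squared cosine of the principal angles between two subspaces.

    0 means the subspaces are orthogonal; 1 means they coincide
    (for bases with the same number of columns).
    """
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return float(np.mean(cosines ** 2))

def merge_subspaces(bases, k):
    """Stack several bases and keep the k leading left singular vectors."""
    stacked = np.concatenate(bases, axis=1)
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :k]

# Random stand-ins for two trained adapters (hypothetical shapes).
rng = np.random.default_rng(0)
d_out, d_in, rank = 64, 32, 4
B_texture, A_texture = rng.normal(size=(d_out, rank)), rng.normal(size=(rank, d_in))
B_geometry, A_geometry = rng.normal(size=(d_out, rank)), rng.normal(size=(rank, d_in))

U_texture = lora_subspace(B_texture, A_texture, rank)
U_geometry = lora_subspace(B_geometry, A_geometry, rank)

overlap = subspace_overlap(U_texture, U_geometry)
self_overlap = subspace_overlap(U_texture, U_texture)
merged = merge_subspaces([U_texture, U_geometry], k=6)

print(f"texture vs geometry overlap: {overlap:.3f}")
print(f"merged basis shape: {merged.shape}")
```

Under these assumptions, an overlap close to 0 would correspond to the "approximately disentangled" finding, and `merged` plays the role of the reduced subspace in which a downstream adapter would be constrained to lie. Random adapters are used here only to make the sketch runnable; they say nothing about the overlaps the paper reports.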

Problem

Research questions and friction points this paper is trying to address.

3D foundation models

LoRA

subspace disentanglement

efficient fine-tuning

attribute variations

Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRA subspaces

disentangled representation

3D foundation models