Learngene Search Across Multiple Datasets for Building Variable-Sized Models

📅 2026-05-06

🤖 AI Summary
Existing learngene methods extract learngenes from a single dataset, which limits downstream performance, and deploying models at multiple scales incurs substantial pretraining and fine-tuning costs. This work proposes LSAMD, a framework for learngene search across multiple datasets. LSAMD expands the ancestry network (Ans-Net) into a searchable super Ans-Net containing dataset-specific blocks and dataset adapters (DADs), searches for an architecture path for each dataset during training, and extracts the base blocks selected most frequently across datasets as learngenes. These learngenes initialize descendant networks (Des-Nets) of varying sizes. On multiple datasets, the approach performs comparably to conventional pretrain-finetune pipelines while significantly reducing storage and training costs.
📝 Abstract
Deep learning methods are widely used under diverse resource constraints, resulting in models of varying sizes, such as the Vision Transformer (ViT) series. Deploying these models typically requires costly pretraining and finetuning. The Learngene paradigm addresses this issue by extracting transferable components, called learngenes, from a pretrained ancestry model (Ans-Net) to initialize variable-sized descendant models (Des-Nets). Existing learngene extraction methods rely on a single dataset, limiting downstream performance. To address this limitation, we propose Learngene Search Across Multiple Datasets for Building Variable-Sized Models (LSAMD). LSAMD expands the Ans-Net into a searchable super Ans-Net with dataset-specific blocks and dataset adapters (DADs). During training, LSAMD searches for an optimal architecture path for each dataset. The base blocks most frequently selected across datasets are extracted as learngenes for initializing Des-Nets. Experiments on multiple datasets show that LSAMD achieves performance comparable to pretrain-finetune methods while significantly reducing storage and training costs.
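The frequency-based extraction step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each dataset's searched path records, per layer, whether the shared base block or a dataset-specific block was chosen, and that the `k` most frequently chosen base blocks become learngenes. The label `BASE`, the function names, the tie-breaking rule, and the toy paths are all hypothetical.

```python
from collections import Counter

BASE = "base"  # hypothetical label for the shared base block at a layer


def count_base_selections(paths):
    """Count, per layer index, how many datasets' searched paths picked the base block."""
    counts = Counter()
    for choices in paths.values():
        for layer, choice in enumerate(choices):
            if choice == BASE:
                counts[layer] += 1
    return counts


def extract_learngenes(paths, k):
    """Return the indices of the k base blocks selected most often, in layer order."""
    counts = count_base_selections(paths)
    # Rank by descending frequency; ties go to the shallower layer (an assumption).
    ranked = sorted(counts, key=lambda layer: (-counts[layer], layer))
    return sorted(ranked[:k])


# Toy searched paths for three datasets over a four-layer super Ans-Net (made up).
paths = {
    "dataset_a": ["base", "base", "specific", "base"],
    "dataset_b": ["base", "specific", "specific", "base"],
    "dataset_c": ["base", "base", "base", "specific"],
}
learngene_layers = extract_learngenes(paths, k=2)
```

In this toy input, layer 0 is chosen by all three datasets and layers 1 and 3 by two each, so the tie-breaking rule decides which of the latter is kept. How ties are resolved and how many blocks are extracted are not stated in the abstract.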
Problem

Research questions and friction points this paper is trying to address.

learngene
multi-dataset
variable-sized models
transfer learning
model deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learngene
Multi-dataset Search
Variable-Sized Models
Dataset Adapters
Transferable Components
Boyu Shi
School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
Junbo Zhou
School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
Chang Liu
School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
Xu Yang
School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
Qiufeng Wang
Southeast University
machine learning, foundation model
Xin Geng
School of Computer Science and Engineering, Southeast University
Artificial Intelligence, Pattern Recognition, Machine Learning