Unleashing MLLMs on the Edge: A Unified Framework for Cross-Modal ReID via Adaptive SVD Distillation

📅 2026-02-13
📈 Citations: 0
Influential: 0

📝 Abstract
Practical cloud-edge deployment of Cross-Modal Re-identification (CM-ReID) is hampered by the need to maintain a fragmented ecosystem of specialized cloud models for diverse modalities. While Multi-Modal Large Language Models (MLLMs) offer strong unification potential, existing approaches neither adapt them into a single end-to-end backbone nor provide effective knowledge distillation strategies for edge deployment. To address these limitations, we propose MLLMEmbed-ReID, a unified framework built on a cloud-edge architecture. First, we adapt a foundational MLLM into a state-of-the-art cloud model: instruction-based prompting guides the MLLM to generate a unified embedding space across RGB, infrared, sketch, and text modalities, and the model is trained efficiently with a hierarchical Low-Rank Adaptation fine-tuning (LoRA-SFT) strategy under a holistic cross-modal alignment objective. Second, to transfer the cloud model's knowledge to an edge-native student, we introduce a novel distillation strategy motivated by the low-rank property of the teacher's feature space. A Principal Component Mapping loss prioritizes essential information, while a Feature Relation loss preserves relational structures. Our lightweight edge model achieves state-of-the-art performance on multiple visual CM-ReID benchmarks, while its cloud counterpart excels across all CM-ReID benchmarks. The MLLMEmbed-ReID framework thus presents a complete and effective solution for deploying unified MLLM-level intelligence on resource-constrained devices. The code and models will be open-sourced soon.
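The abstract names two distillation losses but does not give their formulas. The sketch below is one plausible reading, not the authors' implementation: it assumes the Principal Component Mapping loss regresses projected student features onto the teacher's top-k SVD coordinates, and that the Feature Relation loss matches pairwise cosine-similarity matrices. The function names, the projection matrix `proj`, and the fixed rank `k` are all hypothetical.

```python
import numpy as np


def principal_basis(teacher, k):
    """Top-k right singular vectors of the centered teacher features, shape (d_t, k)."""
    centered = teacher - teacher.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def principal_component_mapping_loss(teacher, student, proj, k):
    """Assumed form: MSE between projected student features and the
    teacher's coordinates in its top-k principal subspace.

    teacher: (n, d_t), student: (n, d_s), proj: (d_s, k) hypothetical map.
    """
    centered_t = teacher - teacher.mean(axis=0, keepdims=True)
    centered_s = student - student.mean(axis=0, keepdims=True)
    target = centered_t @ principal_basis(teacher, k)  # (n, k)
    pred = centered_s @ proj                           # (n, k)
    return float(np.mean((pred - target) ** 2))


def feature_relation_loss(teacher, student):
    """Assumed form: MSE between the pairwise cosine-similarity matrices
    of the teacher batch and the student batch."""
    def cosine_matrix(x):
        norms = np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
        unit = x / norms
        return unit @ unit.T
    return float(np.mean((cosine_matrix(teacher) - cosine_matrix(student)) ** 2))


# Toy usage: a rank-4 teacher batch and a random, untrained student.
rng = np.random.default_rng(0)
teacher_feats = rng.normal(size=(32, 4)) @ rng.normal(size=(4, 64))
student_feats = rng.normal(size=(32, 16))
proj = 0.1 * rng.normal(size=(16, 4))  # would be learned in practice

pcm = principal_component_mapping_loss(teacher_feats, student_feats, proj, k=4)
rel = feature_relation_loss(teacher_feats, student_feats)
print("PCM loss:", pcm, "relation loss:", rel)
```

In a real training loop both terms would be written in an autodiff framework and weighted against each other. How the rank is chosen, which the title's "adaptive" may refer to, is not specified in the abstract and is left as a fixed `k` here.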
Problem

Research questions and friction points this paper is trying to address.

Cross-Modal ReID
Edge Deployment
Multi-Modal Large Language Models
Knowledge Distillation
Unified Framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

MLLM
Cross-Modal ReID
Knowledge Distillation
Low-Rank Adaptation
Edge Deployment
Authors
Hongbo Jiang
Hunan University
Mobile Computing, Wireless Networking, Privacy Preserving
Jie Li
China University of Mining and Technology
Emotion Recognition in Conversation
Xinqi Cai
Xiamen University, Media Analytics and Computing Lab, Department of Artificial Intelligence, School of Informatics, Xiamen, China
Tianyu Xie
School of Mathematical Sciences, Peking University
Bayesian Inference, Generative Models, Bayesian Phylogenetics
Yunhang Shen
Tencent Youtu Lab, Shanghai, China
Pingyang Dai
Xiamen University, Media Analytics and Computing Lab, Department of Artificial Intelligence, School of Informatics, Xiamen, China
Liujuan Cao
Xiamen University, Media Analytics and Computing Lab, Department of Artificial Intelligence, School of Informatics, Xiamen, China