Online-PVLM: Advancing Personalized VLMs with Online Concept Learning

📅 2025-11-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Problem: Existing personalized vision-language models (VLMs) require separate embedding learning for each new concept, which hinders real-time test-time adaptation and scalable, efficient retrieval. Method: We propose the first online concept learning framework for personalized VLMs, which leverages hyperbolic space to model semantic hierarchies and enables zero-shot concept embedding generation; we further introduce OP-Eval, a dynamically updatable, large-scale evaluation benchmark covering cross-domain retrieval and diverse question-answering tasks. Contribution/Results: Experiments demonstrate significant improvements over baselines on OP-Eval, with millisecond-level concept insertion and retrieval. The framework is efficient, scalable, and practical, and it establishes a new paradigm for deploying personalized VLMs in real-world applications.

πŸ“ Abstract
Personalized Visual Language Models (VLMs) are gaining increasing attention for their strong ability to support interactions aligned with user-specific concepts (e.g., identifying a user's bike). Existing methods typically require learning a separate embedding for each new concept, which prevents real-time adaptation at test time. This limitation becomes particularly pronounced in large-scale scenarios, where efficient retrieval of concept embeddings is not achievable. To bridge this gap, we propose Online-PVLM, a framework for online concept learning that leverages hyperbolic representations. Our approach provides a training-free paradigm for generating concept embeddings at test time, making the use of personalized VLMs both scalable and efficient. In addition, we develop OP-Eval, a comprehensive and large-scale benchmark comprising 1,292 concepts and over 30K high-quality instances with diverse question types, designed to rigorously assess online concept learning in realistic scenarios. Extensive experiments demonstrate the state-of-the-art performance of our proposed framework. Our source code and dataset will be made available.
Problem

Research questions and friction points this paper is trying to address.

Enables real-time concept learning during the testing phase
Eliminates need for separate embedding training per concept
Solves scalability issues in large-scale personalized VLM deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online concept learning with hyperbolic representations
Train-free paradigm for embedding generation
Scalable and efficient personalized VLM framework
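
The summary above does not spell out how hyperbolic concept embeddings are built or searched, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the Poincaré ball model with curvature -1, assumes a concept embedding is the exponential map (at the origin) of the mean of some precomputed image feature vectors, and uses a brute-force nearest-neighbour search. The names `expmap0`, `poincare_distance`, and `ConceptStore` are hypothetical and introduced here purely for illustration.

```python
import math


def expmap0(v, eps=1e-9):
    """Map a Euclidean vector into the unit Poincare ball (exp map at the origin)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm < eps:
        return [0.0] * len(v)
    scale = math.tanh(norm) / norm
    return [scale * x for x in v]


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    norm_u = sum(a * a for a in u)
    norm_v = sum(b * b for b in v)
    # Clamp the denominator so points near the boundary do not divide by zero.
    denom = max((1.0 - norm_u) * (1.0 - norm_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


class ConceptStore:
    """Hypothetical online store: concepts are inserted without any training."""

    def __init__(self):
        self._concepts = {}

    def insert(self, name, feature_vectors):
        # Assumption: average the features, then project into the ball.
        dim = len(feature_vectors[0])
        count = len(feature_vectors)
        mean = [sum(f[i] for f in feature_vectors) / count for i in range(dim)]
        self._concepts[name] = expmap0(mean)

    def retrieve(self, query_features, k=1):
        # Brute-force ranking by hyperbolic distance to the query.
        q = expmap0(query_features)
        ranked = sorted(
            self._concepts,
            key=lambda n: poincare_distance(q, self._concepts[n]),
        )
        return ranked[:k]


store = ConceptStore()
store.insert("my_bike", [[0.9, 0.1], [1.1, -0.1]])
store.insert("my_dog", [[-0.8, 0.2]])
print(store.retrieve([0.95, 0.05]))  # ['my_bike']
```

In this sketch, inserting a concept is a single averaging and projection step with no gradient updates, which is the sense in which a training-free scheme can add concepts at test time. A real system at the scale the abstract describes would likely replace the linear scan with an index.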
Authors

Huiyu Bai
Nanyang Technological University

Runze Wang
Alibaba Group

Zhuoyun Du
Zhejiang University

Yiyang Zhao
Ingdan Labs
Internet of Things, Mobile Computing

Fengji Zhang
Department of Computer Science, City University of Hong Kong
Software Engineering, Large Language Models

Haoyu Chen
University of Oulu

Xiaoyong Zhu
Jiangsu University
Electrical Machines, Electrical Vehicle

Bo Zheng
Alibaba Group

Xuejiao Zhao
Nanyang Technological University