Scholar

Rui Qian

Google Scholar ID: QehSWiQAAAAJ

The Chinese University of Hong Kong

Computer vision

Citations & Impact

All-time

Citations

1,941

Publications

20 items

Resume (English only)

Academic Achievements

Papers Published:
- Two papers accepted to CVPR 2025.
- One paper accepted to NeurIPS 2024.
- Two papers accepted to ECCV 2024.
- Two papers accepted to ICCV 2023.
- One paper accepted to CVPR 2023.
Preprints:
- SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree
Publications:
- Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
- Streaming Long Video Understanding with Large Language Models
- Rethinking Image-to-Video Adaptation: An Object-centric Perspective
- Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation
- Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos
- Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
- Static and Dynamic Concepts for Self-supervised Video Representation Learning
- Dual Contrastive Learning for Spatio-temporal Representation
- Motion-aware Contrastive Video Representation Learning via Foreground-background Merging

Research Experience

Has published papers at top-tier international conferences (CVPR, NeurIPS, ECCV, ICCV) and participated in multiple research projects.

Education

Ph.D. candidate at the Multi-Media Lab of The Chinese University of Hong Kong, supervised by Prof. Dahua Lin. He received his Bachelor's degree from the School of Electronic Information and Electrical Engineering at Shanghai Jiao Tong University, supervised by Prof. Weiyao Lin. During his undergraduate studies, he also interned with the SenseTime OpenMMLab group, supervised by Dr. Kai Chen, and worked with Prof. Di Hu.

Background

Research Interests: Computer vision and machine learning, especially self-supervised learning, video understanding, and multi-modal large language models.

Co-authors

11 total