Browse publications on Google Scholar (top-right)
Resume (English only)
Academic Achievements
4DIFF: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation, ECCV 2024
DAM: Dynamic Adapter Merging for Continual Video QA Learning, Preprint
LoCoNet: Long-Short Context Network for Active Speaker Detection, CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives, CVPR 2024 (Oral)
Unified Coarse-to-Fine Alignment for Video-Text Retrieval, ICCV 2023
VindLU: A Recipe for Effective Video-and-Language Pretraining, CVPR 2023
TALLFormer: Temporal Action Localization with Long-memory Transformer, ECCV 2022
Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models, CVPR 2022 (Oral)
High-Resolution 3D Magnetic Resonance Fingerprinting With a Graph Convolutional Network, IEEE Transactions on Medical Imaging (2022)
Spatio-Temporal Fusion based Convolutional Sequence Learning for Lip Reading, ICCV 2019
Acceleration of High-Resolution 3D MR Fingerprinting via a Graph Convolutional Network, MICCAI 2020
Background
Currently a Ph.D. student in the Department of Computer Science at UNC Chapel Hill, advised by Prof. Gedas Bertasius. Research interests center on multimodal video understanding, including video-language (VidL) pretraining, video LLMs, continual learning for video, and video generation.