Released Ovis-U1, our latest unified multimodal understanding and generation foundation model (2025/06).
Paper 'MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes' accepted to ICCV 2025 (2025/06).
Paper 'Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities' released (2025/05).
Paper 'HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration' accepted to ICML 2025 (2025/05).
Paper 'Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior' accepted to ICLR 2025 (2025/01).
Paper 'PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition' accepted to ICLR 2025 (2025/01).
For the full publication list, refer to his Google Scholar profile.
Education
Received a B.Eng. from the School of Electronic and Information Engineering, South China University of Technology (SCUT), in 2021. Currently a PhD candidate in the Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology (HKUST), supervised by Prof. Jun Zhang.
Background
Research interests include Multimodal Understanding and Generation, Neural Data Representation & Compression, and Edge AI. Currently a PhD candidate in the Department of Electronic and Computer Engineering, HKUST, expected to graduate in 2025. Open to both academic and industrial research positions in North America and Asia.
Miscellany
Feel free to get in touch for discussion or collaboration!