Paper 'Zero-Shot Monocular Scene Flow Estimation in the Wild' accepted to CVPR 2025 as an Oral Presentation (Award Candidate, 0.48%); 'Monocular Dynamic Gaussian Splatting is Fast and Brittle and Scene Complexity Rules' accepted to TMLR 2025; organized the End-to-End 3D Learning (E2E3D) workshop at ICCV 2025; gave an invited talk at the Harvard Visual Computing Group; presented a lightning talk at NYC Computer Vision Day 2025.
Research Experience
Research intern at NVIDIA Research, collaborating with Abhishek Badki, Hang Su, and Orazio Gallo; research intern at Meta Reality Labs, working with Numair Khan, Lei Xiao, and Douglas Lanman.
Education
PhD candidate in Computer Science at Brown University, advised by Professor James Tompkin; Master's in Computer Science from Columbia University, advised by Professor Shuran Song and Professor Shih-Fu Chang; Bachelor's in Computer Science from Fudan University, including time as a visiting student at MIT EECS (CSAIL).
Background
Research interests include Foundation Models, Multimodal LLMs, Video Generation, Reinforcement Learning, World Models, and Machine Learning. The aim is to endow artificial agents with spatiotemporal intelligence, blending vision, geometry, and high-level reasoning for real-time perception, prediction, and planning.