Co-authors: 20 (list available)
Resume (English only)
Academic Achievements
- Released a new 7B VLM and large-scale dataset for video understanding (Oct 3, 2024).
- Released the first vision (images and video)-spatial audio model as a step towards complete generation (Jun 18, 2023).
- Published several papers on arXiv including VoMP, Squeeze3D, "Can Vision-Language Models Answer Face to Face Questions in the Real-World?", SEE-2-SOUND, NeRF-US, etc.
Research Experience
- Interned at Qualcomm AI Research in 2024 with Roland Memisevic and Guillaume Berger.
- Interned at Civo in 2023 with Josh Mesout.
- Previously worked on software engineering and robotics; contributed extensively to and maintained some popular open-source projects.
Education
- Undergraduate in Computer Science and Mathematics at the University of Toronto (UofT).
Background
- Very interested in learning algorithms, computer vision, graphics, learning theory, and math (number theory and topology). Currently on a break from undergrad and working at NVIDIA on research at the intersection of AI, vision, and graphics.
Miscellany
- Looking for a PhD position starting Fall 2026. The best way to reach out is via Twitter.