NeurIPS 2025: Paper 'MAGNET' (multi-agent framework for audio-visual RAG) accepted; recognized as Top Reviewer
ICCV 2025: Three papers ('Aurelia', 'AVTrustBench', 'EgoAdapt') accepted; selected for Doctoral Consortium; co-organizing Gen4AVC workshop; invited oral presentations at CLVL, AVGenL, and BinEgo‑360° workshops
ECCV 2024: Paper on audio-visual LLMs accepted
CVPR 2024: Paper 'MeLFusion' accepted (Highlight, Top 2.8%)
ACL 2024 Findings: Paper on robustness against spurious correlations accepted
NAACL 2024: Paper on LLM-guided navigational instruction generation accepted
Nature Scientific Reports: Paper on inferring perceived audience intent from multi-modal social media posts accepted
ICCV 2023: Paper 'AdVerb' accepted; invited talk at AV4D Workshop
EMNLP 2023: Paper 'APoLLo' accepted
Invited talks at CVPR, ICCV, NYU, University of Rochester AIR Lab, and other venues
Research Experience
ML Research Intern at Apple MLR (since Mar 2025), hosted by Chun-Liang Li and Karren Yang
Research Scientist Intern at Meta Reality Labs (Summer 2024), hosted by Ruohan Gao
Student Researcher at Google Research (since Feb 2024), working on speech-driven facial synthesis on the Talking Heads team with Avisek Lahiri and Vivek Kwatra
PhD Research Intern at Adobe Research (since May 2023), working on multi-modal audio generation with Joseph K J on the Multi-modal AI team
Collaborated with Prof. Kristen Grauman, Prof. Salman Khan, Prof. Mohamed Elhoseiny, and other mentors
Machine Learning Scientist on the Camera and Video AI team at ShareChat (India), prior to the PhD
Visiting Researcher at the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute Kolkata, advised by Prof. Ujjwal Bhattacharya
Senior Research Engineer in the Vision Intelligence Group at Samsung R&D Institute Bangalore, developing AI solutions for Samsung smart devices