- One paper on Event Graph-based Interpretable VideoQA accepted to NeurIPS MAR Workshop, 2024.
- Our paper on Multimodal Reasoning on Generated Images was accepted at NeurIPS’24.
- Our paper on Entity-Aware Video Captioning was accepted at EMNLP’24.
- One paper on Procedure Planning accepted at ECCV’24.
- One paper on insufficient context in Multimodal Reasoning accepted at ACM MM’24.
- One paper accepted to AAAI 2024!
- One paper accepted to EMNLP Findings 2023!
Research Experience
Interned at Microsoft, Google, and Adobe, working with Jianwei Yang, Oriana Riva, Tianqi Liu, Arsha Nagrani, Mingda Zhang, Anurag Arnab, and Vlad Morariu.
Education
PhD student in the Dept. of Computer Science at Columbia University, advised by Prof. Shih-Fu Chang. Master’s from UC San Diego, advised by Prof. Gary Cottrell, while also working with Prof. Manmohan Chandraker and Prof. David Kriegman. Bachelor’s from the Indian Institute of Technology, BHU (IIT, BHU).
Background
Research interests include Computer Vision, Natural Language Processing, and Commonsense Reasoning. Particularly interested in building systems that can reason about our world in an interpretable, robust, and trustworthy manner. This involves extensive work with LLMs, agents, tools, and instruction tuning.
Miscellany
On the industry job market. If you are interested, feel free to reach out.