- Efficient Long-Tail Learning in Latent Space by sampling Synthetic Data
- PatentLMM: Large Multimodal Model for Generating Descriptions for Patent Figures
- Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
- Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification
Reviewer for CVPR 2025 and ACL ARR 2024-2025.
Teaching Assistant for CSL 2010: Introduction to Machine Learning at IIT Jodhpur, Fall 2022.
Research Experience
Research Engineer at SpreeAI, developing new training schemes and architectures for virtual try-on in close collaboration with Dr. Aayush Bansal and Dr. Minh Vo.
Part of the Vision, Language and Learning Group (VL2G) at IIT Jodhpur, where he worked on 2D generative models, representation learning, and multi-modal LLMs.
Education
Bachelor of Technology in AI and Data Science from IIT Jodhpur, 2024. Advisor: Prof. Anand Mishra.
Background
Research interests: improving representation learning for downstream tasks and multi-modal generative modeling.
Miscellany
Academic service includes reviewing for conferences such as CVPR and ACL ARR.