Zeeshan Khan
Scholar


Google Scholar ID: uvhBVYoAAAAJ
INRIA Paris
Computer Vision, Vision and Language, Deep Learning
Citations & Impact
All-time
  • Citations: 106
  • H-index: 6
  • i10-index: 2
  • Publications: 10
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • Papers published at top conferences, including CVPR 2025, CVPR 2024, and NeurIPS 2022. Contributed to the ComposeAnything framework for compositional text-to-image generation, the VELOCITI benchmark for evaluating video-language models, the MICap model for identity-aware movie descriptions, and the Grounded Video Situation Recognition framework.
Research Experience
  • Research Assistant in the Computer Vision lab at IIT Gandhinagar, working with Shanmuganathan Raman on computational photography, specifically high-dynamic-range (HDR) image and video reconstruction.
Education
  • PhD: Willow team at Inria and École Normale Supérieure in Paris, advised by Cordelia Schmid and Shizhe Chen.
  • Master's: CVIT, IIIT Hyderabad, advised by C.V. Jawahar and Makarand Tapaswi; thesis on Situation Recognition for Holistic Video Understanding.
Background
  • Research Interests: Unified large multimodal diffusion models, particularly at the intersection of vision and language, for joint understanding and generation across text, images, and videos. Currently exploring compositional representations for high-fidelity, interpretable text-to-image and text-to-video diffusion models.
Miscellany
  • Contact: zeeshan.khan@inria.fr; Office: C-412; Address: 2 Rue Simone Iff, 75012 Paris, France