Paper 'VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding' accepted to CVPR 2025 (equal contribution)
Paper 'VideoRAG: Retrieval-Augmented Generation over Video Corpus' accepted to ACL Findings 2025 (equal contribution)
Preprint 'UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities' released on arXiv in 2025 (equal contribution)
Preprint 'HoliSafe: Holistic Safety Benchmarking and Modeling with Safety Meta Token for Vision-Language Model' released on arXiv in 2025
Recipient of the Qualcomm-KAIST Innovation Award in 2023
Dean’s List, College of Engineering, KAIST, in 2020
Invited talks on VideoRAG at Multimodal Weekly (hosted by TwelveLabs) and the NYU Global AI Frontier Lab in 2025
Background
Ph.D. student at the Graduate School of AI, KAIST, and a member of the MLAI lab
Advised by Prof. Sung Ju Hwang
Research focuses on developing multimodal large language models (MLLMs) that understand the world and interact with humans through visual data
Previously worked on video understanding and multimodal Retrieval-Augmented Generation (RAG)
Interested in embodied AI models that operate on egocentric video and require spatial reasoning