Academic Achievements
- NAACL 2025: 'Evaluating Multimodal Generative AI with Korean Educational Standards' – introduced KoNET, a Korean multimodal benchmark based on standardized tests
- ICLR 2025: 'How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?' – proposed a weight merging method to preserve safety and capability
- EMNLP 2024: 'On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding' – developed a more efficient VLM
- ACL 2024 Findings: 'Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation' – created an open-source VLM for fine-grained evaluation
- EMNLP 2023: 'Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models' – introduced the Cream model for document understanding
Research Experience
- Applied Research Scientist and Tech Lead at NAVER Cloud, managing multiple R&D projects
- Led open-source projects including Donut, Webvicob, and Cream
- Developed LLM-based multimodal products such as HyperCLOVA X Vision
- Conducts Ph.D. research at KAIST AI on multimodal machine learning
- Part-time lecturer at the University of Seoul, teaching AI engineering
Background
- Currently serving as an Applied Research Scientist and Technical Leader (Tech Lead) at NAVER Cloud
- Research interests broadly span Multimodal Machine Learning (Vision, Language, and Audio)
- Conducts research and software engineering for NAVER's LLM-based multimodal solutions and products (e.g., HyperCLOVA X Vision)
- Pursuing a Ph.D. in Artificial Intelligence at KAIST AI under the supervision of Prof. Minjoon Seo
- Serves as a part-time lecturer at the University of Seoul, teaching 'AI Engineering in Production'
- Focuses on building robust and generalizable machine learning systems for real-world applications