Dahun Kim

Google Scholar ID: mHpN1xoAAAAJ

Research Scientist, Google DeepMind

Citations & Impact

Citations (all-time): 3,323

Publications

7 items

Unlocking Multi-Spectral Data for Multi-Modal Models with Guided Inputs and Chain-of-Thought Reasoning

2026

Taking Shortcuts for Categorical VQA Using Super Neurons

2026

EmbeddingGemma: Powerful and Lightweight Text Representations

2025

Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications

2025

Time-Scaling State-Space Models for Dense Video Captioning

2025

Context-Adaptive Multi-Prompt LLM Embedding for Vision-Language Alignment

2025

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models

2025

Resume (English only)

Academic Achievements

EmbeddingGemma: Powerful and Lightweight Text Representations. Gemini Embedding Team, Google. 2025.

Context-Adaptive Multi-Prompt Embedding with Large Language Models for Vision-Language Alignment. Dahun Kim, Anelia Angelova. COLM 2025.

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities. Gemini Team, Google. 2025.

Time-Scaling State-Space Models for Dense Video Captioning. AJ Piergiovanni, Ganesh Mallya, Dahun Kim, Anelia Angelova. BMVC 2025.

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models. Dahun Kim, AJ Piergiovanni, Ganesh Mallya, Anelia Angelova. CVPR 2025.

Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications. Ganesh Mallya, Yotam Gigi, Dahun Kim, Maxim Neumann, Genady Beryozkin, Tomer Shekel, Anelia Angelova. AGU 2025.

Whats in a Video: Factorized Autoregressive Decoding for Online Dense Video Captioning. AJ Piergiovanni, Dahun Kim, Michael S Ryoo, Isaac Noble, Anelia Angelova. 2025.

Learning Visual Grounding from Generative Vision and Language Model. Shijie Wang, Dahun Kim, Ali Taalimi, Chen Sun, Weicheng Kuo. WACV 2025.

Region-centric Image-Language Pretraining for Open-Vocabulary Detection. Dahun Kim, Anelia Angelova, Weicheng Kuo. ECCV 2024.

Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation. Minsu Kim, Jeongsoo Choi, Dahun Kim, Yong Man Roh. TASLP 2024.

Omnibind: Teach to build unequal-scale modality interaction for omni-bind of all. Yuanhuiyi Lyu, Xu Zheng, Dahun Kim, Lin Wang. 2024.

Research Experience

Google DeepMind (formerly Google Brain): Senior Research Scientist / Research Scientist, July 2022 - Present

Google Research: Research Intern, May 2021 - January 2022. Collaborators: Liang-Chieh Chen, Jun Xie

Google Brain: Research Intern, June 2020 - November 2020. Collaborators: Weicheng Kuo, Tsung-Yi Lin, Anelia Angelova

Adobe Research: Research Intern, June 2019 - September 2019. Collaborator: Joon-Young Lee

KAIST: Research Assistant, Robotics and Computer Vision Lab, March 2016 - February 2022

Background

Dahun Kim is a Senior Research Scientist at Google DeepMind. His recent research focuses on improving the capabilities of large multimodal models (e.g., Gemini) and on understanding the interaction between vision and language.

Miscellany

Academic activities include serving as an Area Chair for NeurIPS (2023, 2024, 2025), ICML (2025), and CVPR (2023, 2024, 2025, 2026); serving as an Action Editor of Transactions on Machine Learning Research (TMLR); being named an Outstanding Reviewer at CVPR 2021 and ECCV 2020; and reviewing for multiple top conferences and journals.

Co-authors

27 total