Seungheon Doh
Scholar

Google Scholar ID: MCkggcgAAAAJ
talkpl.ai
Music Information Retrieval, Language Models
Citations & Impact
All-time
Citations: 521
H-index: 9
i10-index: 9
Publications: 20
Co-authors: 13 (list available)
Resume (English only)
Academic Achievements
  • 1. 'Can Large Language Models Predict Audio Effects Parameters from Natural Language?' Submitted to WASPAA 2025.
  • 2. 'TALKPLAY: Multimodal Music Recommendation with Large Language Models' arXiv 2025.
  • 3. 'CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages' arXiv 2025.
  • 4. 'Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Model' Proceedings of ISMIR 2024.
  • 5. 'Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval' Proceedings of ICASSP 2024.
  • 6. 'LP-MusicCaps: LLM-based Pseudo Music Captioning' Proceedings of ISMIR 2023.
Research Experience
  • 1. Research Intern in Music Foundation Model Team at Sony AI (Advisors: Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao), Tokyo, Japan, April 2025 - Present.
  • 2. Research Intern in Music Generation AI Team at Adobe Research (Advisors: Nicholas J. Bryan, Ge Zhu), San Francisco, CA, United States, June 2024 - August 2024.
  • 3. Research Intern in Audio Analysis Team at Chartmetric (Advisor: Keunwoo Choi), Remote, December 2023 - February 2024.
  • 4. Research Intern in Now AI Team at NaverCorp (Advisor: Jeong Choi), South Korea, December 2022 - February 2023.
Education
  • Completed Ph.D., advised by Prof. Juhan Nam.
Background
  • Postdoctoral researcher at the Music and Audio Computing Lab, focusing on developing music intelligence for understanding, retrieval, generation, and post-production tasks. Research directions include:
  • 1. Representation learning methods that establish semantic correspondences between music and other modalities (e.g., natural language, visual content).
  • 2. Exploration of multi-modal large language models (MLLMs) for music applications, with a focus on reasoning, chain-of-thought processes, and tool calling.
  • 3. Development of conversational interfaces for music applications, emphasizing user experience and practical value in real-world scenarios.
Miscellany
  • Plans to enter the job market in March 2026.