Chaeyoung Jung
Google Scholar ID: LooPec8AAAAJ
KAIST
Multi-modal LLMs · generative models · audio-visual processing
Citations & Impact (all-time)
  • Citations: 59
  • H-index: 5
  • i10-index: 3
  • Publications: 10
  • Co-authors: 3
Academic Achievements
  • Publications:
  • Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models (Preprint, 2025)
  • AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding (NeurIPS, 2025)
  • InfiniteAudio: Infinite-Length Audio Generation with Consistent Acoustic Attributes (Interspeech, 2025)
  • SEED: Speaker Embedding Enhancement Diffusion Model (Interspeech, 2025)
  • From Faces to Voices: Learning Hierarchical Representations for High-Quality Video-to-Speech (CVPR, 2025)
  • VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis (ICASSP, 2025)
  • FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching (Interspeech, 2024)
  • Seeing Through the Conversation: Audio-Visual Speech Separation Based on Diffusion Model (ICASSP, 2024)
  • TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning (ICASSP, 2024)
Research Experience
  • Research Projects: Multimodal learning and generative modeling in audio applications
Education
  • Degree: Ph.D. student
  • University: KAIST
  • Advisor: Professor Joon Son Chung
  • Started: March 2025
  • Major: Multimedia and Artificial Intelligence (MMAI)
Background
  • Research Interests: Multimodal learning, particularly deepening the understanding and reasoning capabilities of multi-modal large language models (MLLMs)
  • Professional Field: Generative modeling in audio, including text-to-audio generation, speech enhancement, source separation, and lip-to-speech synthesis