- Learning Semantic Information from Raw Audio Signal Using Both Contextual and Phonetic Representations (ICASSP, 2024)
- EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning (ICASSP, 2024)
Education
PhD student at the Language Technologies Institute, Carnegie Mellon University. Co-advised by Professor Carlos Busso and Professor Shinji Watanabe.
Background
Research Interests: Developing artificial intelligence that understands and interacts with the world in a human-like manner by integrating multiple modalities, particularly by bridging linguistic and visual knowledge with auditory information.
Miscellany
Feel free to contact me at jaeyeon2@andrew.cmu.edu!