Zeyu Jin

Google Scholar ID: R-PFLHMAAAAJ
Adobe Research
Speech and Audio Processing · Deep Learning
Citations & Impact
All-time
Citations: 2,164
H-index: 22
i10-index: 36
Publications: 20
Co-authors: 0
Academic Achievements
  • SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation (CHI 2025)
  • Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs (ICLR 2025)
  • Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation (Interspeech 2024)
  • GR0: Self-Supervised Global Representation Learning for Zero-Shot Voice Conversion (ICASSP 2024)
  • MDX-GAN: Enhancing Perceptual Quality in Multi-Class Source Separation Via Adversarial Training (ICASSP 2024)
  • Efficient Spoken Language Recognition Via Multilabel Classification (Interspeech 2023)
  • Audio Similarity is Unreliable as a Proxy for Audio Quality (Interspeech 2022)
  • HEAR: Holistic Evaluation of Audio Representations (NeurIPS 2021)
  • Controllable Speech Representation Learning via Voice Conversion and AIC Loss (ICASSP 2022)
  • SQAPP: No-Reference Speech Quality Assessment Via Pairwise Preference (ICASSP 2022)
  • Music Enhancement via Image Translation and Vocoding (ICASSP 2022)
  • Controllable Deep Melody Generation via Hierarchical Music Representation (ISMIR 2021)
  • HiFi-GAN-2: Studio-Quality Speech Enhancement via Generative Adversarial Networks Conditioned on Acoustic Features (IEEE Workshop)
Research Experience
  • Interned at Adobe three times between 2015 and 2017; presented his primary research project, VoCo, at Adobe MAX Sneaks in 2016.
Education
  • Ph.D. in Computer Science from Princeton University, advised by Adam Finkelstein; M.S. in Music Technology from Carnegie Mellon University.
Background
  • Research interests: Deep generative models for speech, including studio-quality speech enhancement, speech quality assessment, and personalized voice generation. Also interested in HCI for audio applications and music generation.