Scholar
Yusheng Dai
Google Scholar ID: tvjQ7GUAAAAJ
Monash University
Multimodal
Speech Processing
Computer Vision
Citations & Impact
All-time
Citations
115
H-index
6
i10-index
5
Publications
12
Co-authors
7
list available
Contact
No contact links provided.
Publications
4 items
Omni2Sound: Towards Unified Video-Text-to-Audio Generation
arXiv.org · 2026
Cited
0
ControlAudio: Tackling Text-Guided, Timing-Indicated and Intelligible Audio Generation via Progressive Diffusion Modeling
2025
Cited
0
Latent Swap Joint Diffusion for Long-Form Audio Generation
2025
Cited
0
Phoneme-Level Contrastive Learning for User-Defined Keyword Spotting with Flexible Enrollment
2024
Cited
0
Resume (English only)
Co-authors
7 total
Jun Du
Professor, NERC-SLIP, USTC
Chin-Hui Lee
Georgia Tech
Odette Scharenborg
Full Professor, Delft University of Technology, The Netherlands
Sabato Marco Siniscalchi
Unipa, NTNU, GaTech
Shinji Watanabe
Carnegie Mellon University
Jingdong Chen
Northwestern Polytechnical University
James Anderson
Associate Professor, Electrical Engineering, Columbia University