Scholar
Yuzhe Liang
Google Scholar ID: deUSxiYAAAAJ
Shanghai Jiao Tong University
Deep learning
Multimodal Learning
Citations & Impact (All-time)
Citations: 191
H-index: 7
i10-index: 5
Publications: 14
Co-authors: 4 (list available)
Publications
11 items
V2A-DPO: Omni-Preference Optimization for Video-to-Audio Generation
2026 · Cited by 0
Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis
2026 · Cited by 0
MM-Sonate: Multimodal Controllable Audio-Video Generation with Zero-Shot Voice Cloning
arXiv.org · 2026 · Cited by 1
DMP-TTS: Disentangled multi-modal Prompting for Controllable Text-to-Speech with Chained Guidance
2025 · Cited by 0
M3-TTS: Multi-modal DiT Alignment & Mel-latent for Zero-shot High-fidelity Speech Synthesis
2025 · Cited by 0
InstructAudio: Unified speech and music generation with natural language instruction
2025 · Cited by 0
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization
2025 · Cited by 0
MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
2025 · Cited by 0
Co-authors
4 total
Xie Chen
Shanghai Jiao Tong University <- Microsoft <- Cambridge University
Ziyang Ma
Shanghai Jiao Tong University
Zhisheng Zheng
The University of Texas at Austin
Wenxi Chen
Shanghai Jiao Tong University