Scholar
Songjun Cao
Google Scholar ID: 0H6jEP8AAAAJ
Tencent
speech understanding
speech generation
multi-modal
LLM
Citations & Impact (all-time)
Citations: 287
H-index: 8
i10-index: 7
Publications: 19
Co-authors: 0
Publications
7 items
Thinking with Constructions: A Benchmark and Policy Optimization for Visual-Text Interleaved Geometric Reasoning (2026). Cited: 0
Leveraging large multimodal models for audio-video deepfake detection: a pilot study (2026). Cited: 0
PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data (2025). Cited: 0
MPE-TTS: Customized Emotion Zero-Shot Text-To-Speech Using Multi-Modal Prompt (2025). Cited: 0
Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception (2025). Cited: 0
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models (2025). Cited: 0
Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition (2025). Cited: 0