Scholar

Shuai Wang

Google Scholar ID: us4prRUAAAAJ

Nanjing University

Publications

27 items

SLT 2026 REAL-TSE Challenge: Real-world Target Speaker Extraction from Conversational Recordings

2026

BeatEdit: Symbolic Music Generation as Explicit Editing

2026

TokAN: Accent Normalization Using Self-Supervised Speech Tokens

2026

MIMFlow: Integrating Masked Image Modeling with Normalizing Flows for End-to-End Image Generation

2026

UniDDT: Unifying Multimodal Understanding and Generation with Decoupled Diffusion Transformer

2026

Geometrically Constrained Decentralized Independent Vector Analysis for Distributed Microphone Arrays

2026

Foley-Omni: A Unified Multimodal Generation Model from Task-Level Audio Synthesis to Complete Video Soundtrack Generation

2026

SpeakerCard-1M: An Evidence-Grounded Speaker Card Corpus for In-the-Wild Speaker Verification

2026

Resume (English only)

Academic Achievements

Published 8 papers on arXiv, developed 3 models (FlowDCN, NeuralSolverDistillation-SDXL, NeuralSolverDistillation-SD1.5), and maintained several datasets.

Research Experience

Involved in several research projects, including UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions and Emu3.5: Native Multimodal Models are World Learners.

Background

AI & ML interests

Miscellany

Active on the Hugging Face platform, following and commenting on the latest research developments in relevant fields.

Co-authors

4 total

Limin Wang

Nanjing University

Yao Teng

The University of Hong Kong, Nanjing University

Zexian Li

Alibaba

Ziteng Gao

National University of Singapore