Xiaobin Zhuang

Google Scholar ID: a-crUqgAAAAJ

ByteDance

Audio Generation

Citations & Impact

All-time

Citations: 310
H-index: 8
i10-index: 7
Publications: 16
Co-authors: 1

Publications

8 of 16 shown

DiSTAR: Diffusion over a Scalable Token Autoregressive Representation for Speech Generation

2025

Cited

0

Heptapod: Language Modeling on Visual Signals

2025

Cited

0

Sounding that Object: Interactive Object-Aware Image to Audio Generation

2025

Cited

0

MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation

2025

Cited

0

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

2025

Cited

0

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

2025

Cited

0

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

2025

Cited

0

Sound-VECaps: Improving Audio Generation with Visual Enhanced Captions

2024

Cited

0

Co-authors

1 total

ByteDance, Seed