
Shiyin Kang

Google Scholar ID: mnCHk8EAAAAJ

SenseTime Inc.

Speech Synthesis · Voice Conversion · Speech Recognition · Machine Learning · High Performance Computing


Citations & Impact

All-time

Citations: 2,914

H-index: 24

i10-index: 43

Publications: 20

Co-authors: 49



Publications

6 items

Breaking the Quality–Intelligibility Trade-off in Streaming Target Speaker Extraction via Deep-Feature-Anchored Preference Optimization

2026

Cited: 0

How Should LLMs Listen While Speaking? A Study of User-Stream Routing in Full-Duplex Spoken Dialogue

2026

Cited: 0

Towards Streaming Target Speaker Extraction via Chunk-wise Interleaved Splicing of Autoregressive Language Model

2026

Cited: 0

InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue

2025

Cited: 0

AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation

IEEE Transactions on Multimedia · 2023

Cited: 1

Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT

Interspeech · 2019

Cited: 24


Co-authors

49 total

Zhiyong WU (吴志勇)

Associate Professor, Tsinghua University

Dong Yu (俞栋)

Distinguished Scientist @ Tencent AI Lab, ACM/IEEE/ISCA Fellow

SpeechX Limited

Chinese University of Hong Kong

PhD student, Tsinghua University

Meituan multi-modal team, PhD (The Chinese University of Hong Kong)