Jixun Yao (姚继珣)

Google Scholar ID: KjcXd6cAAAAJ

Northwestern Polytechnical University

Voice Conversion, Speech Synthesis

Citations & Impact

All-time citations: 380

Contact

Email: yaojx@mail.nwpu.edu.cn

Publications

20 items

Resume (English only)

Academic Achievements

- Publications: More than 20 papers in top international speech conferences and journals
- Example Papers:
* Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech (ICLR 2025)
* GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling (AAAI 2025)
* StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching (AAAI 2025)
* Drop the beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation (ICASSP 2025)
* DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification (ICASSP 2025)
* Distinctive and Natural Speaker Anonymization via Singular Value Transformation-Assisted Matrix (IEEE TASLP 2024)
* PromptVC: Flexible stylistic voice conversion in latent space driven by natural language prompts (ICASSP 2024)
* DualVC 2: Dynamic masked convolution for unified streaming and non-streaming voice conversion (ICASSP 2024)
* GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition (ICASSP 2024)
* DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion (INTERSPEECH 2024)
* NPU-NTU System for Voice Privacy 2024 Challenge (VPC 2024)
* NTU-NPU System for Voice Privacy 2024 Challenge (VPC 2024)
* The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings (ISCSLP 2024)
* The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge (ISCSLP 2024)
* Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models
* Preserving background sound in noise-robust voice conversion via multi-task learning (ICASSP 2023)
* Distinguishable speaker anonymization based on formant and fundamental frequency scaling (ICASSP 2023)
* Expressive-VC: Highly expressive voice conversion with attention fusion of bottleneck and perturbation features (ICASSP 2023)

Research Experience

- 2024.03 - 2025.02: Nanyang Technological University, Singapore (supervised by Prof. Eng-Siong Chng)
- 2022.12 - 2024.02: Everest Team - Ximalaya, China

Education

- Degree: Ph.D.
- University: Northwestern Polytechnical University
- Supervisor: Prof. Lei Xie
- Time: Ongoing
- Major: Audio, Speech, and Language Processing

Background

- Research Interests: Speech synthesis, voice conversion, and speaker anonymization
- Professional Field: Speech processing, large language models
- Brief Introduction: A fourth-year Ph.D. student at the School of Computer Science, Northwestern Polytechnical University, supervised by Prof. Lei Xie.

Co-authors

7 total