Jixun Yao (姚继珣)

Google Scholar ID: KjcXd6cAAAAJ
Northwestern Polytechnical University
Voice Conversion, Speech Synthesis
Citations & Impact (All-time)
Citations: 380
H-index: 13
i10-index: 18
Publications: 20
Co-authors: 7
Resume (English only)
Academic Achievements
  • Publications: More than 20 papers in top international speech conferences and journals
  • Example Papers:
    • Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech (ICLR 2025)
    • GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling (AAAI 2025)
    • StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching (AAAI 2025)
    • Drop the beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation (ICASSP 2025)
    • DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification (ICASSP 2025)
    • Distinctive and Natural Speaker Anonymization via Singular Value Transformation-Assisted Matrix (IEEE TASLP 2024)
    • PromptVC: Flexible stylistic voice conversion in latent space driven by natural language prompts (ICASSP 2024)
    • DualVC 2: Dynamic masked convolution for unified streaming and non-streaming voice conversion (ICASSP 2024)
    • GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition (ICASSP 2024)
    • DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion (INTERSPEECH 2024)
    • NPU-NTU System for Voice Privacy 2024 Challenge (VPC 2024)
    • NTU-NPU System for Voice Privacy 2024 Challenge (VPC 2024)
    • The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings (ISCSLP 2024)
    • The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge (ISCSLP 2024)
    • Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models
    • Preserving background sound in noise-robust voice conversion via multi-task learning (ICASSP 2023)
    • Distinguishable speaker anonymization based on formant and fundamental frequency scaling (ICASSP 2023)
    • Expressive-VC: Highly expressive voice conversion with attention fusion of bottleneck and perturbation features (ICASSP 2023)
Research Experience
  • 2024.03 - 2025.02: Nanyang Technological University, Singapore (supervised by Prof. Eng-Siong Chng)
  • 2022.12 - 2024.02: Everest Team, Ximalaya, China
Education
  • Degree: Ph.D.
  • University: Northwestern Polytechnical University
  • Supervisor: Prof. Lei Xie
  • Time: Ongoing
  • Major: Audio, Speech, and Language Processing
Background
  • Research Interests: Speech synthesis, voice conversion, and speaker anonymization
  • Professional Field: Speech processing, large language models
  • Brief Introduction: A fourth-year Ph.D. student at the School of Computer Science, Northwestern Polytechnical University, supervised by Prof. Lei Xie.