Xiaodong Cui

Google Scholar ID: wzNVJQsAAAAJ
Principal Research Scientist, IBM T. J. Watson Research Center
automatic speech recognition, deep learning, signal processing, pattern recognition
Citations & Impact (all-time)
Citations: 3,154
H-index: 24
i10-index: 43
Publications: 20
Co-authors: 51
Resume (English only)
Academic Achievements
  • Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis (ICLR 2025)
  • M2 ASR: Multilingual Multi-task Automatic Speech Recognition via Multi-objective Optimization (INTERSPEECH 2024)
  • How Do Nonlinear Transformers Learn and Generalize in In-Context Learning? (ICML 2024)
  • How Do Nonlinear Transformers Acquire Generalization-Guaranteed CoT Ability? (ICML 2024)
  • How Can Personalized Context Help? Exploring Joint Retrieval of Passage and Personalized Context (ICASSP 2024)
  • Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization (ICASSP 2024)
  • Bilevel Joint Unsupervised and Supervised Training for Automatic Speech Recognition (IEEE/ACM TASLP 2024)
  • Improving RNN Transducer Acoustic Models for English Conversational Speech Recognition (INTERSPEECH 2023)
  • Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data (ICML 2023)
  • Patents: Faithful And Efficient Sample-based Model Explanations (US 12423614, 22 Sep 2025)
  • Patents: Transformer-based Encoding Incorporating Metadata (JP 7730241, 18 Aug 2025)
  • Patents: Counterfactual Neural Network Learning For Contextual Enhanced Earnings Call Analysis (US 12361492, 14 Jul 2025)
  • Patents: Media Capture Device With Power Saving And Encryption (JP 7710816, 10 Jul 2025)
Research Experience
  • Granite speech: IBM open-source speech-aware large language models
  • High performance deep neural network acoustic modeling
  • Distributed acoustic modeling for automatic speech recognition
  • Fast and accurate speech recognition for customer care
  • E2E Learning for Speech Recognition and Synthesis
  • Unsupervised Learning Techniques for Large Unlabeled Data for Speech
  • IARPA Babel project for spoken term detection
  • DARPA Transtac project for multi-lingual speech-to-speech translation
  • Adjunct Professor, Department of Electrical Engineering, Columbia University
Education
  • I received my B.S. degree from Shanghai Jiao Tong University, Shanghai, China, my M.S. degree from Tsinghua University, Beijing, China, and my Ph.D. degree from the University of California, Los Angeles, all in electrical engineering.
Background
  • I am a principal research scientist in the speech department of the IBM T. J. Watson Research Center. My research interests include automatic speech recognition, multi-lingual speech-to-speech translation, digital speech processing, statistical signal processing, machine learning, and pattern recognition. Most recently, I have been working on deep learning in speech applications.