Academic Achievements
- SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation, The Second Conference on Language Modeling (COLM), 2025
- Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling, The 13th International Conference on Learning Representations (ICLR), 2025
- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models, The 12th International Conference on Learning Representations (ICLR), 2024
- Module-wise Adaptive Distillation for Multimodality Foundation Models, The 37th Conference on Neural Information Processing Systems (NeurIPS), 2023
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression, The 40th International Conference on Machine Learning (ICML), 2023
- HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers, The 11th International Conference on Learning Representations (ICLR), 2023
- PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance, The 39th International Conference on Machine Learning (ICML), 2022
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models, The 10th International Conference on Learning Representations (ICLR), 2022
- CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing, The 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022
- Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach, The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Research Experience
Currently a Senior Researcher at Microsoft, working on the training and adaptation of OpenAI and Microsoft models.
Education
- Ph.D. in Machine Learning, Georgia Tech, School of Industrial & Systems Engineering, December 2023, Advisor: Prof. Tuo Zhao
- M.S. in Computational Science & Engineering, Georgia Tech, School of Computational Science & Engineering, May 2020
- B.S. in Electrical Engineering, USC, Department of Electrical & Computer Engineering, May 2018
Background
Research interests: Deep learning and natural language processing, with a primary focus on improving the efficiency and generalizability of neural language models.