- Sailor2: Multilingual language models supporting 15 languages, including English, Chinese, and Burmese.
- Sailor: Open language models for South-East Asian languages, ranging from 0.5B to 14B parameters.
- SailCraft: Data toolkit for Sailor language models.
Research Experience
- Research Scientist at Sea AI Lab
- Research Intern at Microsoft Research Asia with Dr. Jian-Guang Lou
- Research Intern at the National University of Singapore with Professor Min-Yen Kan
Education
Ph.D. and Bachelor's degrees in Computer Science from Harbin Institute of Technology, advised by Professor Wanxiang Che.
Background
Research Interests: Natural Language Processing, particularly multilingual LLM pre-training. Previously a research intern at Microsoft Research Asia and the National University of Singapore.
Miscellany
Contact: Email, Google Scholar, LinkedIn, GitHub, Twitter