- Created the first full RLHF pipeline for training multilingual instruction-following LLMs, including a dataset of 161,443 messages across 35 languages; accepted at NeurIPS D&B 2023.
- Introduced TMMLU+, a new benchmark for Traditional Chinese language understanding; accepted at COLM 2024.
- Studied the impact of format restrictions on the performance of large language models; accepted at the EMNLP 2024 Industry Track.
- Proposed StreamBench, a novel benchmark for evaluating the continuous improvement of LLM agents; accepted at NeurIPS D&B 2024.
- Investigated LLMs' ability to proactively ask for user support, with a focus on text-to-SQL generation.
Research Experience
Research Scientist at Appier AI Research, focusing on Large Language Models (LLMs); prior work spans keyword generation, search optimization, AutoML, and Generative Adversarial Networks (GANs).
Background
Research interests include LLM foundations and behavior, multilingual understanding, and applied AI. Specializes in artificial intelligence, particularly the development and optimization of large language models.