Published several papers, including "SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages" (EMNLP 2024), "Multilingual Large Language Models Are Not (Yet) Code-Switchers" (EMNLP 2023), and "Improving Large-scale Language Models and Resources for Filipino" (LREC 2022). Also involved in projects such as FilBench, MoMentS, and SEA-VL.
Research Experience
Lead Research Engineer at Samsung Research in the Philippines, working on low-resource machine translation and dialogue generation. Also affiliated with the University of the Philippines, De La Salle University, and Senti AI.
Education
PhD student at MBZUAI, supervised by Dr. Alham Fikri Aji and Prof. Thamar Solorio.
Background
Research interests include multilinguality and low-resource languages, with a particular interest in understanding how models behave in low-resource multilingual settings. Research areas cover code-switching, resources & evaluation, and applications in low-resource settings.
Miscellany
Personal website includes a blog and photography portfolio.