Publications
Published multiple papers, including 'MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts' (arXiv 2024) and 'Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models' (TMLR 2025), among others.
Research Experience
Currently a research scientist at Thinking Machines Lab; previously worked at Meta Superintelligence Labs and Salesforce Research.
Education
PhD from the Paul G. Allen School of Computer Science & Engineering, University of Washington, advised by Luke Zettlemoyer and Michael D. Ernst, with research focused on code generation with neural networks.
Background
Research scientist passionate about building generally intelligent systems that process information at scale and assist humans across a wide range of knowledge-intensive tasks.