Papers: 'Towards 131k-Context dLLMs', 'R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training', and 'Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition'; ICML 2025 spotlight paper on novel in-context learning behaviors; Released longdllm, a new method for extending dLLMs to 131k-token contexts.
Research Experience
Research intern (Summer 2025) at the Microsoft Research Deep Learning Group, mentored by Xiaodong Liu and Lucas Liu; Formerly a full-stack software engineer at Academia.edu and AbbVie Stemcentrx.
Education
PhD: UW-Madison, Department of Computer Science, advised by Fred Sala; Master's: Harvard University, SEAS; Bachelor's: Caltech, Computer Science.
Background
Research Interests: Data-driven methods for compute-efficient learning; Field: Computer Science; Brief Introduction: A PhD student at UW-Madison, advised by Fred Sala.
Miscellany
Personal Projects: MadML, a reading seminar for students interested in all aspects of frontier machine learning research.