Has published academic papers on topics including LLM activation decoding, reducing malicious use through unlearning, and feedback loops with language models. His papers have appeared at venues including ICML 2024, ICML 2023 (Oral), and ICLR 2022.
Research Experience
Contributed to research projects including LatentQA and the WMDP Benchmark, collaborating with multiple co-authors. Won Best Use of ESRI Technology at the Caltech Hackathon 2020 and Best Social Network Hack at the Stanford Hackathon 2021.
Education
Studied mathematics and computer science at Caltech, advised by Anima Anandkumar and Yuanyuan Shi; pursuing a CS PhD at UC Berkeley, advised by Jacob Steinhardt.
Background
Works on safety at xAI. Currently finishing a CS PhD at Berkeley, with research interests in interpretability and monitoring for AI agents.
Miscellany
Enjoys hackathons and has worked on hackathon projects with Yongkyun (Daniel) Lee, Evan Yeh, and Terry Kwon.