Published several papers, including 'Dynamic Risk Assessments for Offensive Cybersecurity Agents', 'Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications', and 'Evaluating Copyright Takedown Methods for Language Models'. One of these papers received the Best Paper award at SoLaR @ NeurIPS 2024.
Research Experience
Engaged in multiple research projects during his PhD studies at Princeton University.
Education
PhD student in Electrical and Computer Engineering at Princeton University, advised by Prof. Peter Henderson; completed undergraduate studies at the University of Science and Technology of China (USTC).
Background
Broadly interested in alignment and other safety-related topics. Open to collaboration or discussion on these topics.