Junxiao Yang

Google Scholar ID: 8Zu6HocAAAAJ
Tsinghua University
NLP, AI Safety, Trustworthy AI
Citations & Impact
All-time
Citations: 278
H-index: 6
i10-index: 3
Publications: 11
Co-authors: 8 (list available)
Resume (English only)
Academic Achievements
  • Paper accepted at ACL 2025 Main Conference: 'Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints' (first author)
  • ACL 2024 Long Paper: 'Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization' (co-first author)
  • Multiple preprints on LLM safety, knowledge boundaries, fine-tuning data leakage, and safety evaluation frameworks
  • Developed AISafetyLab: a comprehensive framework for AI safety evaluation and improvement
  • Excellent Graduate, Tsinghua University, 2025
  • 3rd Prize, Global Challenge for Safe and Secure LLMs (Track 1)
  • Academic Excellence in Research Award, Tsinghua University (2023.09–2024.09)
  • Meritorious Winner, Mathematical Contest in Modeling, 2023
  • Comprehensive Scholarship, Tsinghua University (2021–2023, two consecutive years)