Scholar
Yihao Zhang
Google Scholar ID: 9lALkz8AAAAJ
Peking University
AI Safety
Formal Methods
Mechanistic Interpretability
Citations & Impact
All-time
Citations: 139
H-index: 7
i10-index: 6
Publications: 14
Co-authors: 7
Contact
Email
zhangyihao@stu.pku.edu.cn
Publications
15 items
ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems
2026
Cited
0
RACA: Representation-Aware Coverage Criteria for LLM Safety Testing
2026
Cited
0
CBA: Communication-Bound-Aware Cross-Domain Resource Assignment for Pipeline-Parallel Distributed LLM Training in Dynamic Multi-DC Optical Networks
2025
Cited
0
FAIRY2I: Universal Extremely-Low Bit QAT framework via Widely-Linear Representation and Phase-Aware Quantization
2025
Cited
0
Experiences from Benchmarking Vision-Language-Action Models for Robotic Manipulation
2025
Cited
0
Automata-Based Steering of Large Language Models for Diverse Structured Generation
2025
Cited
0
Why does weak-OOD help? A Further Step Towards Understanding Jailbreaking VLMs
2025
Cited
0
Fairy$\pm i$: the First 2-bit Complex LLM with All Parameters in $\{\pm 1, \pm i\}$
2025
Cited
0
Resume (English only)
Academic Achievements
Paper 'Boosting jailbreak attack with momentum' accepted as Oral at ICASSP 2025
Two papers accepted at ICASSP 2024 (first author and second-to-last author, respectively)
Paper 'Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models' accepted at NeurIPS 2024
One paper accepted at SETTA 2024 (second author)
Paper 'On the Duality Between Sharpness-Aware Minimization and Adversarial Training' accepted at ICML 2024
Two papers accepted at ICLR 2024 R2-FM Workshop (first author and second-to-last author, respectively)
Paper 'MedTiny: Enhanced Mediator Modeling Language for Scalable Parallel Algorithms' accepted at QRS-C 2023
Paper 'Sharpness-Aware Minimization Alone can Improve Adversarial Robustness' accepted at AdvML-Frontiers@ICML 2023
Undergraduate thesis 'Automata Extraction from Transformers' posted on arXiv
Awarded the Beijing Natural Science Foundation Undergraduate 'Initiating Research' Program (2023)
Background
First-year PhD student in Applied Mathematics at the School of Mathematical Sciences, Peking University
Research interests include:
Safety, Interpretability, and Social Value of LLM-based Agents (current focus)
Mechanistic Interpretability for Large Language Models (current focus)
Causality in AI, Formalization and Verification of Causality-Related Issues (current focus)
Large Language Model Alignment and Trustworthy LLMs
Representation Engineering in LLMs
AI Safety, verification of robustness/fairness/trustworthiness in AI systems
Automated Interactive Theorem Proving (AI4ITP)
Formal Methods, Model Checking, Software Analysis, Program Verification
Formalizing and verifying quantum computation and quantum AI systems
Testing technologies for AI systems
Co-authors
7 total
Zeming Wei
Ph.D. Candidate, Peking University
Meng Sun
Professor, School of Mathematical Sciences, Peking University
Xiyue Zhang
University of Bristol
Sun Jun
Professor of SCIS, SMU
Hangzhou He
PhD student, Peking University
Yifei Wang
MIT
Huanran Chen
PhD student, Tsinghua SAIL