David Williams-King

Google Scholar ID: IRiBU0gAAAAJ
Research Scientist, Mila
cybersecurity, artificial intelligence, accessibility
Citations & Impact
All-time
Citations: 678
H-index: 9
i10-index: 8
Publications: 20
Co-authors: 5 (list available)
Contact
No contact links provided.
Publications
8 items
Behavioural Analysis of Alignment Faking (2026). Cited: 0
FragBench: Cross-Session Attacks Hidden in Benign-Looking Fragments (2026). Cited: 0
Latent Personality Alignment: Improving Harmlessness Without Mentioning Harms (2026). Cited: 0
LLM Wardens: Mitigating Adversarial Persuasion with Third-Party Conversational Oversight (2026). Cited: 0
Diagnosing Pathological Chain-of-Thought in Reasoning Models (2026). Cited: 0
Representation Engineering for Large-Language Models: Survey and Research Challenges (2025). Cited: 0
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? (2025). Cited: 0
Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity (2025). Cited: 0
Co-authors
5 total
Junfeng Yang
Professor of Computer Science, Columbia University
Vasileios P. Kemerlis
Associate Professor, Brown University
Kexin Pei
Assistant Professor, Computer Science, University of Chicago
Co-author 4
Yoshua Bengio
Professor of computer science, University of Montreal, Mila, IVADO, CIFAR
