Seonglae Cho
Google Scholar ID: XIMB1PoAAAAJ
University College London
Mechanistic Interpretability
Language Modeling
AI Alignment
Homepage
Google Scholar
Citations & Impact
All-time
Citations
2
H-index
1
i10-index
0
Publications
4
Co-authors
3
Contact
Twitter
GitHub
LinkedIn
Publications
5 items
Control Reinforcement Learning: Interpretable Token-Level Steering of LLMs via Sparse Autoencoder Features
2026
Cited
0
The Confidence Manifold: Geometric Structure of Correctness Representations in Language Models
2026
Cited
0
CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection
2025
Cited
0
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies
2025
Cited
0
LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries
2025
Cited
0
Resume (English only)
Academic Achievements
Maintains multiple open-source projects, including CorrSteer and intuiter.
Background
Self-described knowledge absorber; believes creativity is all one needs.
Miscellany
Based in London; interests include software development and AI, among others.
Co-authors
3 total
Zekun Wu
Research Scientist, Holistic AI / PhD Student, University College London
Adriano Soares Koshiyama
CEO at Holistic AI, Honorary Research Fellow in Computer Science at UCL
Dongha Lee
Yonsei University