Seonglae Cho
Google Scholar ID: XIMB1PoAAAAJ
University College London
Mechanistic Interpretability
Language Modeling
AI Alignment
Homepage
Google Scholar
Citations & Impact
All-time
Citations
2
H-index
1
i10-index
0
Publications
4
Co-authors
3
Contact
Twitter
GitHub
LinkedIn
Publications
5 items
Control Reinforcement Learning: Interpretable Token-Level Steering of LLMs via Sparse Autoencoder Features
2026
Cited
0
The Confidence Manifold: Geometric Structure of Correctness Representations in Language Models
2026
Cited
0
CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection
2025
Cited
0
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies
2025
Cited
0
LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries
2025
Cited
0
Resume (English only)
Academic Achievements
Maintains multiple open-source projects, including CorrSteer and intuiter.
Background
Self-described knowledge absorber; believes creativity is all one needs.
Miscellany
Based in London; interests include software development and AI, among others.
Co-authors
3 total
Zekun Wu
Research Scientist, Holistic AI / PhD Student, University College London
Adriano Soares Koshiyama
CEO at Holistic AI, Honorary Research Fellow in Computer Science at UCL
Dongha Lee
Yonsei University