Alexey Dontsov
Scholar

Google Scholar ID: 2SK4CMIAAAAJ
HSE, AI Interpretability Lab
unlearning, mechanistic interpretation
Citations & Impact
All-time
Citations: 24
H-index: 2
i10-index: 2
Publications: 5
Co-authors: 7 (list available)
Publications
6 items
Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines? (2026). Cited: 0
The Rogue Scalpel: Activation Steering Compromises LLM Safety (2025). Cited: 0
OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features (2025). Cited: 0
Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs (2025). Cited: 0
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders (2025). Cited: 0
CLEAR: Character Unlearning in Textual and Visual Modalities (arXiv.org, 2024). Cited: 0
Co-authors
7 total
Elena Tutubalina (KFU)
Ivan Oseledets (AIRI; Skolkovo Institute of Science and Technology)
Oleg Y. Rogov (University of Sharjah, MTUCI)
Dmitrii Korzh (MTUCI)
Anton Razzhigaev (Independent researcher)
Andrey Galichin (RSI Lab)
Anton Korznikov (Independent researcher)
