Tuomas Oikarinen

Google Scholar ID: M3KZnPwAAAAJ
UC San Diego
Machine Learning, Interpretability, Mechanistic Interpretability, Explainable AI
Citations & Impact (all-time)
  • Citations: 788
  • H-index: 9
  • i10-index: 9
  • Publications: 20
  • Co-authors: 5 (list available)
Resume (English only)
Academic Achievements
  • Publications:
    - 'Evaluating Neuron Explanations: A Unified Framework with Sanity Checks', ICML 2025
    - 'Linear Explanations for Individual Neurons', ICML 2024
    - 'Label-Free Concept Bottleneck Models', ICLR 2023
    - 'CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks', ICLR 2023 (Spotlight)
    - Plus several other papers on interpretability.
Research Experience
  • PhD research under the guidance of Prof. Tsui-Wei (Lily) Weng at UC San Diego.
Education
  • PhD student at UC San Diego, advised by Prof. Tsui-Wei (Lily) Weng.
  • Bachelor of Science in Computer Science and Engineering and in Philosophy from MIT.
Background
  • Research Interests: Developing scalable ways to understand deep learning, especially using (mechanistic) interpretability to help improve safety and reliability of neural networks. Current interests include automated interpretability, rigorous interpretability evals, concept bottleneck models (CBMs), and sparse autoencoders (SAEs).
Miscellany
  • Contact: toikarinen@ucsd.edu. Links to Google Scholar and GitHub are provided on the personal homepage.