- 'CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks', ICLR 2023 (Spotlight)
- Plus several other papers on interpretability.
Research Experience
PhD research under the guidance of Prof. Tsui-Wei (Lily) Weng at UC San Diego.
Education
PhD student at UC San Diego, advised by Prof. Tsui-Wei (Lily) Weng; Bachelor of Science in Computer Science and Engineering and in Philosophy from MIT.
Background
Research Interests: Developing scalable ways to understand deep learning, especially using (mechanistic) interpretability to improve the safety and reliability of neural networks. Current interests include automated interpretability, rigorous interpretability evals, concept bottleneck models (CBMs), and sparse autoencoders (SAEs).
Miscellany
Contact: toikarinen@ucsd.edu. Links to Google Scholar and GitHub are provided on the personal homepage.