Adrià Garriga-Alonso

Google Scholar ID: OtnThiMAAAAJ
Research Scientist, FAR AI
AI safety, interpretability
Citations & Impact (all-time)
  • Citations: 3,482
  • H-index: 12
  • i10-index: 12
  • Publications: 20
  • Co-authors: 14
Resume
Academic Achievements
  • Co-authored papers including 'Towards Automatic Circuit Discovery for Mechanistic Interpretability' and 'Causal Scrubbing: a Method for Rigorously Testing Interpretability Hypotheses'.
Research Experience
  • Currently a Research Scientist at FAR AI. Previously worked at Redwood Research on interpretability research and software development.
Education
  • Holds a PhD in machine learning from the University of Cambridge, advised by Prof. Carl Rasmussen. His research focused on improving uncertainty quantification in neural networks using Bayesian principles.
Background
  • Research interests include how neural networks work internally, evaluating the accuracy of interpretability explanations, finding algorithmic explanations at lower labor and compute costs, and understanding the behavior and motivations of agent-like AIs. The goal is to ensure that AI is beneficial to society.
Miscellany
  • His personal blog covers topics such as remote development, ethics, the Catalan language, contest problem write-ups, and algorithms.