NeurIPS 2025 Spotlight paper: 'On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling'
Two papers accepted at ICML 2025: 'How Much Can We Forget about Data Contamination?' and 'Rethinking Explainable Machine Learning as Applied Statistics'
COLM 2024 paper: 'Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models'
NeurIPS 2023 Spotlight paper: 'Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness'
AISTATS 2023 paper: 'From Shapley Values to Generalized Additive Models and back'
Preprint (Sep 2025): 'Train Once, Answer All: Many Pretraining Experiments for the Cost of One'
Preprint (Aug 2025): 'Informative Post-Hoc Explanations Only Exist for Simple Functions'
Invited talk at Banff Workshop 'New Directions in Machine Learning Theory' (Oct 2024)
Background
Postdoctoral researcher in the Theory of Machine Learning group at the University of Tübingen
Research interests include large language models (LLMs) and interpretability
Currently working on a systematic understanding of pre-training
Previously conducted black-box evaluations of LLMs
At Microsoft Research, ran experiments combining GPT-4 and Generalized Additive Models (GAMs) in healthcare settings
During the PhD, explored various topics in explainable machine learning, including connections between post-hoc explanation methods and inherently interpretable models, as well as the regulatory suitability of explanation algorithms