Resume (English only)
Academic Achievements
- Is It Thinking or Cheating? Detecting Implicit Reward Hacking by Measuring Reasoning Effort, preprint, 2025
- Refusal Direction is Universal Across Safety-Aligned Languages, NeurIPS, 2025
- Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation, ICLR, 2025
- Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think, COLM, 2024
- "My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models, ACL Findings, 2024
- ACTOR: Active Learning with Annotator-specific Classification Heads to Embrace Human Label Variation, EMNLP, 2023
- How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives, ACL, 2023
- Sceneformer: Indoor Scene Generation with Transformers, 3DV, 2021
Research Experience
- PhD Student, MaiNLP Lab, LMU Munich
- Visiting Researcher, New York University
- Student Researcher, TUM
Education
- PhD, LMU Munich; Supervisor: Prof. Barbara Plank
- M.Sc. in Robotics, Cognition, Intelligence, Technical University of Munich (TUM); Research: Indoor scene synthesis
Background
Research interests: alignment. Currently a PhD student at the MaiNLP lab at LMU Munich, supervised by Prof. Barbara Plank, and a visiting researcher at New York University, advised by Prof. He He.