- Paper: J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
- Paper: MENLO: From Preferences to Proficiency – Evaluating and Modeling Native-like Quality Across 47 Languages
- Papers: What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations and Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models, both accepted at ACL 2025
- Paper: PRobELM: Plausibility Ranking Evaluation for Language Models accepted at COLM 2024
- Paper: Low-Rank Adaptation for Multilingual Summarisation: An Empirical Study accepted to the Findings of NAACL 2024
Research Experience
- Research Scientist at Meta, working closely with Jason Weston on the FAIR Alignment team on projects involving reinforcement learning, LLM-as-a-judge, and reward modeling
- Visiting researcher at the University of Cambridge, previously a postdoctoral research associate collaborating with Prof. Andreas Vlachos on factuality in NLP
- Internship at Google DeepMind, working on multilingual summarization
Education
- PhD in Knowledge-Grounded NLP from City, University of London
- Master’s degree in Electrical Engineering from the University of Erlangen-Nürnberg and University College London
Background
Research interests include large-scale reasoning models, post-training and reinforcement learning, LLM-as-a-judge, and generative reward modeling, with a focus on fundamental AI research in Large Language Models (LLMs).
Miscellany
Actively exploring Senior Research Scientist roles in industry.