Published 'Asymptotically optimal regret in communicating Markov decision processes' in 2025; completed the PhD thesis 'Optimal regrets in Markov decision processes' in 2024 and received the prix de thèse UGA 2025; published 'Achieving Tractable Minimax Optimal Regret in Average Reward MDPs' at NeurIPS 2024; published 'The Sliding Regret in Stochastic Bandits: Discriminating Index and Randomized Policies' in 2023.
Research Experience
Currently a Post-Doc at the Institut de Recherche en Informatique de Toulouse (IRIT). Has also been a Post-Doc in Grenoble and visited Urtzi Ayesta's team in Toulouse.
Education
PhD thesis supervised by Bruno Gaujal and Panayotis Mertikopoulos, defended in November 2024. The thesis is devoted entirely to regret minimization in Markov decision processes in the undiscounted, infinite-horizon setting.
Background
Research interests include Reinforcement Learning, Multi-Armed Bandits, Game Theory, and Queuing Theory. Aims to explain the results of learning algorithms run on his computer by pushing theory to its limits, focusing on finite Markov decision processes, stochastic bandits, and finite games in normal form.