Published the monograph 'From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning' (2014); supervised several PhD students, including Sébastien Bubeck, who is now an assistant professor at Princeton, and Odalric-Ambrym Maillard, who received the AFIA 2012 best thesis award; organized and chaired activities at several academic conferences, including an AAAI 2013 tutorial and co-chairing ALT 2013.
Research Experience
Worked at Microsoft Research New England from 2013 to 2014; worked with the SequeL team at INRIA Lille - Nord Europe; participated in several European projects and activities, including PASCAL2, COMPLACS, and CO-ADAPT.
Background
Currently a Senior Researcher at Google DeepMind. Research interests include bandit theory, optimistic algorithms (such as KL-UCB and UCB-V), Thompson sampling, the foundations of Monte-Carlo Tree Search, optimal control, and reinforcement learning.