
Roger Creus Castanyer

Google Scholar ID: E3y_txsAAAAJ

Mila/University of Montreal

Reinforcement Learning · Foundation Models



Publications

8 items

Breaking the Solver Bottleneck: Training Task Generators at the Learnable Frontier

2026


unix-ctf: Procedural Environments for Unix-Competence Reinforcement Learning

2026


PopuLoRA: Co-Evolving LLM Populations for Reasoning Self-Play

2026


Agentick: A Unified Benchmark for General Sequential Decision-Making Agents

2026


Align and Filter: Improving Performance in Asynchronous On-Policy RL

2026


ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning

2025


Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

2025


RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning

arXiv.org · 2024


Resume (English only)

Academic Achievements

ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning (preprint, October 2025)
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning (NeurIPS 2025, Spotlight, top 3% of submissions)
AI Research Scholarship (February 2025)
Academic Excellence Scholarship (December 2024)
Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning (RLC 2024)
RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning (TMLR 2024)
Improving Intrinsic Exploration by Creating Stationary Objectives (ICLR 2024)
PixelEDL: Unsupervised Skill Discovery and Learning from Pixels (Embodied AI workshop @ CVPR 2021)
Unsupervised Skill-Discovery and Skill-Learning in Minecraft (Unsupervised Reinforcement Learning workshop @ ICML 2021)
PiCoEDL: Discovery and Learning of Minecraft Navigation Goals from Pixels and Coordinates (Embodied AI workshop @ CVPR 2021)
Integration of Convolutional Neural Networks in Mobile Applications (Workshop on AI Engineering @ ICSE 2021)
Which Design Decisions in AI-enabled Mobile Applications Contribute to Greener AI? (Empirical Software Engineering Journal 2022)
Enhancing sequence-to-sequence modelling for RDF triples to natural text (WebNLG workshop)

Research Experience

Research Intern @ Ubisoft LaForge (Montreal, Canada); Teaching Assistant @ University of Montreal (Montreal, Canada); Junior Data Scientist @ HP Inc (Barcelona, Spain); Research Assistant @ UPC (Barcelona, Spain); Basketball Coach @ Sagrada Familia Claror (Barcelona, Spain).

Education

PhD: Mila / UdeM (Fall 2024 onwards), Supervisors: Pablo Samuel Castro and Glen Berseth; MSc: Mila Québec & University of Montréal (Montreal, Canada); BSc: Universitat Politècnica de Catalunya (UPC) (Barcelona, Spain).

Background

Research Interests: Deep Reinforcement Learning, Foundation Models (LLMs, VLMs), AI Agents. Focus: Building general, autonomous agents that integrate the structured learning and adaptability of RL with the broad priors and reasoning abilities of foundation models, improving exploration, credit assignment, and skill discovery.

Miscellany

From Barcelona, Spain, currently in Montreal, Canada.
