Research Areas: Off-policy Learning, Objective Functions, Empirical Reinforcement Learning. Focuses on how to minimize the value-error objective without access to v_π, how to weight the error tolerated in each state, and whether to measure errors under an L2 distance, an L1 distance, or something else.
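The weighted value-error objective described above can be sketched as follows. This is an illustrative example, not code from any particular project: the L2 case corresponds to the standard (mean squared) value error, Σ_s d(s)[v_π(s) − v̂(s)]², where the state weighting d(s) controls how much error is tolerated in each state. Note that in practice v_π is unknown, which is exactly the difficulty named above.

```python
import numpy as np

def weighted_value_error(v_hat, v_pi, d, ord=2):
    """Weighted error between a value estimate v_hat and the true v_pi.

    d weights the error tolerated in each state; ord selects the distance
    (2 for squared/L2 error, 1 for absolute/L1 error). All names here are
    illustrative assumptions, not from a specific codebase.
    """
    err = np.abs(v_hat - v_pi)
    if ord == 2:
        return float(np.dot(d, err ** 2))
    if ord == 1:
        return float(np.dot(d, err))
    raise ValueError("ord must be 1 or 2")

# Small example with three states; d emphasizes errors in state 0.
v_pi = np.array([1.0, 0.5, -0.2])   # true values (unknown in practice)
v_hat = np.array([0.8, 0.6, 0.0])   # learned estimate
d = np.array([0.6, 0.3, 0.1])       # state weighting, e.g. a visitation distribution

l2 = weighted_value_error(v_hat, v_pi, d, ord=2)   # 0.031
l1 = weighted_value_error(v_hat, v_pi, d, ord=1)   # 0.17
```

Changing d or the choice of norm changes which estimates the objective prefers: an L2 distance penalizes large errors in heavily weighted states most, while an L1 distance is less sensitive to outlier states.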
Background
Research Interests: Improving the stability and usability of off-policy learning algorithms. The aim is to find practical, efficient algorithms that learn value functions and policies from arbitrary data, which would significantly reduce the cost of deploying RL algorithms in the real world. Effectively measuring the quality of a policy also allows for better decision-making before deployment.
Miscellany
Contact Information: Office - CSC 2-05; Email - ap3(at)ualberta.ca, andnpatterson(at)gmail.com; Links - GitHub, Google Scholar, LinkedIn.