Paper on continuous-time policy evaluation accepted to SIAM Journal on Mathematics of Data Science; new paper on RL with action-triggered observations; new paper on RL fine-tuning of diffusion models with function approximation; new paper on federated learning with physical communication channels; new paper on statistical guarantees for continuous-time reinforcement learning; new paper on optimal interpolation between bootstrap and rollout methods in reinforcement learning; new paper on debiasing general Z-estimators.
Research Experience
Assistant Professor at the Department of Statistical Sciences, University of Toronto.
Education
Ph.D. from the Department of EECS, UC Berkeley, advised by Prof. Martin Wainwright and Prof. Peter Bartlett; B.S. in Computer Science from Peking University, advised by Prof. Liwei Wang.
Background
Research Interests: Mathematics of machine learning in the era of large AI models, including post-training optimization of generative models, reinforcement learning fine-tuning and test-time adaptation, practical structures that enable efficient reinforcement learning with function approximation, RL in continuous-time diffusion processes, stochastic approximation for large-scale machine learning, and the incorporation of machine learning into causal and semiparametric estimation problems. On the applied side, applications of machine learning to engineering problems in the physical world.