Paper 'Off-Policy Corrected Reward Modeling for Reinforcement Learning from Human Feedback' accepted at COLM 2025
Two papers accepted at RLC 2025: 'Recursive Reward Aggregation' and 'Offline Reinforcement Learning with Domain-Unlabeled Data'
Paper 'Offline Reinforcement Learning from Datasets with Structured Non-Stationarity' accepted at RLC 2024
Paper 'Unsupervised Task Clustering for Multi-Task Reinforcement Learning' published at ECML-PKDD 2021
Contributed to multiple RL research directions, including task representation, reward aggregation beyond the discounted sum, and learning from non-stationary datasets