2025: Granted a USPTO patent on 'Next-generation Smart Agriculture Networks'.
2025: Paper accepted as Spotlight at NeurIPS 2025; additional papers accepted at ACL 2025, ACM SIGMETRICS 2025 Workshop, ICLR 2025 Workshop, IEEE ICC 2025, AAMAS 2025, AAAI 2025 (including Innovative Applications track), etc.
2024: Two papers accepted at NeurIPS 2024 (acceptance rate: 25.8%); one at AAAI 2025 (23.4%); one at AAMAS 2025 (24.5%); publications in IEEE Transactions on Wireless Communications, IEEE/ACM Transactions on Networking, AIoT 2024, RLC 2024 Workshop, etc.
2024: Selected as Top Reviewer (top 8%) at NeurIPS 2024.
2024: Awarded travel grant by Meta for the 1st RL Conference; received Excellence in Research Award from the Data Science Program at Stony Brook University.
Serving as PC member for multiple top-tier conferences: KDD 2025, IJCAI 2025, ICML 2025, AISTATS 2025, ICLR 2025, COLING 2025, AAAI 2024, etc.
2025: Delivered an invited plenary talk at the NSF AI Institute for Societal Decision Making; gave an invited talk at the SIAM Conference on Financial Mathematics & Engineering (FM25).
Background
Postdoctoral fellow in the Department of Computer Science at Harvard University, affiliated with Teamcore, hosted by Prof. Milind Tambe.
Motivated by complex, resource-constrained sequential decision-making problems under uncertainty.
Research objective is to advance reinforcement learning (RL) through structured RL frameworks that exploit inherent problem structure to improve sample efficiency and accelerate learning.
Focuses on addressing key challenges in both model-based and model-free RL, especially in environments with multiple coupled Markov Decision Processes (MDPs).
Aims to develop RL algorithms with provable sub-linear regret guarantees.
Seeks to make structured RL scalable and impactful in real-world applications such as edge/cloud computing, cloud caching, wireless video streaming, and healthcare.
Primary research interests include: online sequential decision-making under uncertainty, stochastic optimization and control, finite-time convergence and regret analysis for RL, decentralized optimization, multi-agent RL for networked decision-intelligent systems, and broader societal impacts in public health and social good.