Shresth Verma
Scholar

Google Scholar ID: XZQSq2sAAAAJ
Harvard University
AI for Public Health, Planning and Optimization, Multiagent Systems
Citations & Impact
All-time
Citations: 265
H-index: 9
i10-index: 8
Publications: 20
Co-authors: 0
Resume (English only)
Academic Achievements
  • Publications:
      - Preference-Robust DPO (AAAI 2026)
      - Portfolios for Multi-objective RL (ICML 2025)
      - LLMs for RL Code Generation (AAAI 2025)
      - Social Choice Language Model (NeurIPS 2024 GenAI for Health Workshop)
      - Group Fairness in Decision-Focused Learning (UAI 2024)
      - PRIORITY2REWARD: Incorporating Healthworker Preferences for Resource Allocation Planning (AAAI 2025)
      - Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning (ICML 2025)
      - Improving Health Information Access in the World’s Largest Maternal Mobile Health Program via Bandit Algorithms (IAAI 2024)
      - Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare (IJCAI 2023)
      - Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health (AAAI 2023)
      - Robust Planning over Restless Groups: Engagement Interventions for a Large-Scale Maternal Telehealth Program (AAAI 2023)
Research Experience
  • 2021 - 2023: Google Research India, AI for Social Good lab, advised by Dr. Aparna Taneja. Developed and deployed robust bandit algorithms for targeted mobile health interventions reaching over 100K beneficiaries from underserved communities in India.
  • Before 2021: Data Scientist at United Health Group. Modeled readmission risks for millions of beneficiaries and designed graph-based analytics and tools using the world's largest healthcare graph database.
Education
  • 2021 - Present: Harvard University, PhD student, advisor: Prof. Milind Tambe.
Background
  • Third-year PhD student at Harvard University, advised by Prof. Milind Tambe. Research interests include reinforcement learning and LLMs for complex decision-making tasks. Current work focuses on improving LLM code generation through inference-time reasoning, developing robust LLM fine-tuning techniques, and designing safe mechanisms for balancing tradeoffs in multi-objective preferences.