Shresth Verma

Google Scholar ID: XZQSq2sAAAAJ

Harvard University

AI for Public Health, Planning and Optimization, Multiagent Systems

Citations & Impact

Citations (all-time): 265

Contact

Email: svverma at g dot harvard dot edu

Publications

7 items

Bilevel Optimization of Synthetic Trajectories for Multi-Turn LLM Fine-Tuning

2026

Decisions and Deployment: The Five-Year SAHELI Project (2020-2025) on Restless Multi-Armed Bandits for Improving Maternal and Child Health

2026

Lightweight Robust Direct Preference Optimization

2025

Preference Robustness for DPO with Applications to Public Health

2025

Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning

2025

Measuring Fairness in Financial Transaction Machine Learning Models

2025

Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards

arXiv.org · 2024

Resume (English only)

Academic Achievements

Publications:
- Preference-Robust DPO (AAAI 2026)
- Portfolios for Multi-objective RL (ICML 2025)
- LLMs for RL Code Generation (AAAI 2025)
- Social Choice Language Model (NeurIPS 2024 GenAI for Health Workshop)
- Group Fairness in Decision-Focused Learning (UAI 2024)
- PRIORITY2REWARD: Incorporating Healthworker Preferences for Resource Allocation Planning (AAAI 2025)
- Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning (ICML 2025)
- Improving Health Information Access in the World’s Largest Maternal Mobile Health Program via Bandit Algorithms (IAAI 2024)
- Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare (IJCAI 2023)
- Scalable decision-focused learning in restless multi-armed bandits with application to maternal and child health (AAAI 2023)
- Robust planning over restless groups: engagement interventions for a large-scale maternal telehealth program (AAAI 2023)

Research Experience

2021 - 2023: Google Research India, AI for Social Good lab, advised by Dr. Aparna Taneja. Developed and deployed robust bandit algorithms for targeted mobile health interventions reaching over 100K beneficiaries from underserved communities in India. Before that, worked as a Data Scientist at UnitedHealth Group, modeling readmission risk for millions of beneficiaries and designing graph-based analytics and tools on the world's largest healthcare graph database.

Education

2021 - Present: Harvard University, PhD student, advisor: Prof. Milind Tambe.

Background

Third-year PhD student at Harvard University, advised by Prof. Milind Tambe. Research interests include reinforcement learning and LLMs for complex decision-making tasks. Current work focuses on improving LLM code generation through inference-time reasoning, developing robust LLM fine-tuning techniques, and designing safe mechanisms for balancing tradeoffs among multi-objective preferences.

