Mehul Damani

Google Scholar ID: PBAmmGMAAAAJ
MIT
Reinforcement Learning · Multi-Agent Systems
Citations & Impact
All-time
  • Citations: 1,245
  • H-index: 9
  • i10-index: 9
  • Publications: 15
  • Co-authors: 1
Academic Achievements
  • Trained reasoning models to reason about their uncertainty using RL; new paper published!
  • Started an internship at the MIT-IBM Watson Lab to work on RL for tool use.
  • Paper on test-time training accepted to ICML!
Research Experience
  • Worked with Lerrel Pinto at NYU on developing automatic curriculum learning methods for RL agents.
  • As part of the MARMot Lab at NUS, worked with Guillaume Sartoretti on applying multi-agent reinforcement learning to traffic signal control and multi-agent pathfinding.
Education
  • Third-year Ph.D. student at MIT, advised by Jacob Andreas.
Background
  • Research interests lie at the intersection of reinforcement learning (RL) and large language models (LLMs): using RL to improve reasoning, math, coding, and other capabilities in LLMs; reducing hallucinations; and studying how optimally selecting inference-time techniques can significantly improve the efficiency of LLMs.
Miscellany
  • Always excited to explore new research directions and open to collaborating or advising students. If you are interested in my research or simply want to chat, don't hesitate to get in touch!