
Sai Rajeswar

Google Scholar ID: h-sqIigAAAAJ

Staff Research Scientist, ServiceNow; Adjunct Professor, Mila

machine learning, generative models, reinforcement learning


Citations & Impact

Citations (all-time): 3,970


Contact

Email: rajsai24@gmail.com

Publications

24 items

- R2V Agent: Teaching SLMs When to Ask for Help (2026)
- Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics (2026)
- Multi-scale Predictive Representations for Goal-conditioned Reinforcement Learning (2026)
- Therefore I am. I Think (2026)
- Terminal Agents Suffice for Enterprise Automation (2026)
- CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents (2026)
- EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings (2026)
- Grammar Search for Multi-Agent Systems (2025)

Resume

Academic Achievements

Published multiple papers, including:
- AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding (NeurIPS 2025).
- WebMMU: A Multimodal, Multilingual Benchmark for Website Understanding & Code Generation (EMNLP 2025, Oral).
- BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning (CoLM 2025).
- UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction (ICML 2025).
- BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks (ICLR 2025).
- GenRL: Multimodal Foundation World Models for Generalist Embodied Agents (NeurIPS 2024).
- Rendering-Aware Reinforcement Learning for Vector Graphics Generation (NeurIPS 2025).
- StarVector: Generating Scalable Vector Graphics Code From Images And Text (CVPR 2025).

Research Experience

Currently a Staff Research Scientist at ServiceNow, and an Adjunct Professor and core industry member at Mila Montréal. During his Ph.D., he interned as a Research Scientist at Google DeepMind.

Education

Ph.D. from Mila, University of Montreal, supervised by Prof. Aaron Courville; Master's in Computer Science from IIT Delhi, where he received the Prof. A. K. Sinha best student award.

Background

Research interests span generative models and reinforcement learning, with a recent focus on multimodal perception and world representations. These are key for generalist AI systems that integrate perception and action while incorporating feedback from the environment.

Miscellany

If you are interested in his work and would like to explore fundamental research questions around these topics, collaborate, or receive mentorship, feel free to reach out.

Co-authors

16 total