Provable Zero-Shot Generalization in Offline Reinforcement Learning, ICML 2025
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation, UAI 2025
Safe Decision Transformer with Learning-based Constraints, L4DC 2025
Breaking the log(1/Δ₂) Barrier: Better Batched Best Arm Identification with Adaptive Grids, ICLR 2025
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds, ICLR 2025
Variance-Dependent Regret Bounds for Non-stationary Linear Bandits, AISTATS 2025
Uncertainty-Aware Reward-Free Exploration with General Function Approximation, ICML 2024
Risk Bounds of Accelerated SGD for Overparameterized Linear Regression, ICLR 2024
Background
Assistant Professor, Department of Computer Science, Indiana University Bloomington
Research focuses on the foundations of machine learning, particularly sequential decision-making problems such as bandits and reinforcement learning (RL)
Interested in near-optimal statistical complexity for RL with function approximation and provable guarantees for contextual bandits with neural networks
Studies theoretical foundations of optimization algorithms for deep learning, including sample complexity of SGD-based methods and stochastic algorithms that escape saddle points
Recently exploring decision-making algorithms for complex structures (e.g., RL for large language models, hierarchical RL) from both algorithmic and systems perspectives