1. Paper 'RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs', NeurIPS 2025, LAW Workshop.
2. Paper 'Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains', NeurIPS 2025, 5th Workshop on Mathematical Reasoning and AI.
3. Paper 'Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!', NeurIPS 2025, CogInterp Workshop.
4. Patent 'System and Method for Intelligent Scheduling of Manufacturing Jobs', US Patent Application US20240319718A1.
5. Paper 'Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning', ICRA 2022; also presented at the NeurIPS 2021 Deep RL Workshop and the NeurIPS 2021 Offline RL Workshop.
6. Master's Thesis 'Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL', 2021.
Research Experience
1. Worked as an ML Research Engineer at Quantiphi, Bangalore, applying reinforcement learning to workforce optimization.
2. Served as a Research Fellow at Microsoft Research India, focusing on reinforcement learning algorithms for energy grids.
3. Contributed as an AI Engineer at AI Labs, Hyderabad, developing a quadrupedal controller and building object-detection models.
4. Studied reinforcement learning and stochastic approximation in the Stochastic Systems Lab and the Stochastic Robotics Lab.
Education
Master's degree: Computer Science and Engineering, Indian Institute of Science (IISc) Bangalore, advised by Prof. Shishir Kolathaya and Prof. Shalabh Bhatnagar.
Bachelor's degree: Civil Engineering, National Institute of Technology, Warangal (NITW).
Background
Research interests: Artificial Intelligence, Reinforcement Learning, Large Language Models, Machine Learning, Optimization, and Deep Learning. Professional fields: Reinforcement Learning, Machine Learning, Optimization, Robotics.