Subhojyoti Mukherjee

Google Scholar ID: VFixSK8AAAAJ
Adobe Research
Multi-armed Bandits, Reinforcement Learning, Large Language Models, RLHF
Citations & Impact
All-time
Citations: 274
H-index: 10
i10-index: 12
Publications: 20
Co-authors: 29 (list available)
Resume (English only)
Academic Achievements
  • Paper 'Learning to Clarify by Reinforcement Learning Through Reward-Weighted Fine-Tuning' accepted at NeurIPS 2025 (main conference).
  • Paper 'From Selection to Generation: A Survey of LLM-based Active Learning' accepted at ACL 2025 (main conference).
  • Paper 'Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning' accepted at RLC 2025 (main conference).
Research Experience
  • Adobe Research (San Jose): Research Scientist/Engineer (Mar 2025 - Present). Involved in pre-training and post-training of small LMs for Adobe Document Cloud; contributed to the English Document Overview model in Acrobat Reader; worked on the AI Assistant project in Adobe Express.
  • Amazon AWS AI (Santa Clara, USA): Summer 2024 (full-time), hosted by Branislav Kveton et al., Area of Research: Multi-objective alignment for LLMs.
  • Amazon AWS AI (Santa Clara, USA): Fall 2023 (part-time), hosted by Branislav Kveton et al., Area of Research: RLHF with LLMs.
  • Amazon AWS AI (Santa Clara, USA): Summer 2023 (full-time), hosted by Branislav Kveton et al., Area of Research: Active In-Context Learning with LLMs.
  • CMU, ECE Dept. (Pittsburgh, USA): Summer 2019, hosted by Prof. Gauri Joshi, Area of Research: Structured Bandits.
  • Adobe Research (San Jose, USA): Spring 2018, hosted by Branislav Kveton, Area of Research: Item recommendation with Ranking and Bandits.
  • INRIA, SequeL Lab (Lille, France): Fall 2017, hosted by Odalric Maillard, Area of Research: Non-stationary Bandits.
Education
  • Ph.D.: Fall 2019 to Feb 2025, ECE, University of Wisconsin-Madison, advised by Dr. Robert Nowak, Dr. Josiah Hanna, and Dr. Qiaomin Xie. Areas of Research: Reinforcement Learning, Active Learning, incorporating deep active learning strategies for Large Language Models (LLMs), etc.
  • M.S. by Research: 2015 to 2018, CSE, Indian Institute of Technology (IIT) Madras, advised by Dr. Balaraman Ravindran and Dr. Nandan Sudarsanam. Areas of Research: Reinforcement learning, Multi-Armed Bandit settings.
  • Bachelor of Technology: 2009 to 2013, Dept. of Computer Science and Engineering, Meghnad Saha Institute of Technology, Kolkata, under West Bengal University of Technology, India.
Background
  • Research interests include training machine learning models, reinforcement learning, and the fine-tuning and alignment of large language models (LLMs). Currently a research scientist at Adobe Research, focusing on pre-training and post-training of small language models.