- Paper “Towards Two-Stage Counterfactual Learning to Rank” accepted at ICTIR 2025.
- Preprint of “A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning” available.
- Full paper “Optimal Baseline Corrections for Off-policy Contextual Bandits” accepted at RecSys 2024 with an oral presentation.
- Full paper “Practical and Robust Safety Guarantees for Advanced Counterfactual Learning to Rank” accepted at CIKM 2024.
- PC member for ICML 2024, ICLR 2024, SIGIR 2024, RecSys 2024, and ICTIR 2024.
Research Experience
- Worked at Meta AI, developing a reinforcement learning approach for fine-tuning text-to-image diffusion models.
- Joined the Modern Recommender Systems team in NYC, working on off-policy learning for two-stage recommendation systems and mixture-of-experts architectures for video recommendations.
- Data scientist at Flipkart in India, working on search ranking and query understanding.
Education
- PhD, Information Retrieval Lab, University of Amsterdam, supervised by Prof. Maarten de Rijke and Prof. Harrie Oosterhuis
- Research Master's degree, Search and Information Extraction Lab, IIIT-Hyderabad, supervised by Dr. Manish Gupta and Prof. Vasudeva Varma
Background
Research interests span machine learning, information retrieval, off-policy methods for ranking and contextual bandits, and reinforcement learning for post-training of foundation models. The main focus is on improving recommendation and ranking systems by learning from user interactions.
Miscellany
Actively seeking full-time positions as a Research Scientist, Research Engineer, or Machine Learning Engineer, starting as soon as possible.