Branislav Kveton

Google Scholar ID: CZaDvPgAAAAJ
Adobe Research
Artificial Intelligence · Machine Learning
Citations & Impact (all-time)
  • Citations: 4,162
  • h-index: 36
  • i10-index: 80
  • Publications: 20
  • Co-authors: 0
Academic Achievements
  • 2025. 'Personalization of Large Language Models: A Survey.' Transactions on Machine Learning Research.
  • 2025. 'GUI Agents: A Survey.' In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL).
  • 2025. 'From Selection to Generation: A Survey of LLM-based Active Learning.' ACL 2025.
  • 2025. 'Adaptive Submodular Policy Optimization.' In Proceedings of the 2nd Reinforcement Learning Conference (RLC).
  • 2025. 'FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain.' In Proceedings of the 42nd International Conference on Machine Learning (ICML).
Research Experience
  • 2024–present: Principal Research Scientist at Adobe Research.
  • 2021–2024: Research Scientist at Amazon.
  • 2018–2021: Research Scientist at Google Research.
  • 2014–2018: Research Scientist at Adobe Research.
  • 2011–2014: Research Scientist at Technicolor Research Center.
  • 2006–2011: Research Scientist at Intel Research.
Background
  • Proposes, analyzes, and applies algorithms that learn incrementally, run in real time, and converge to near-optimal solutions as observations increase.
  • Recent work focuses on applying these ideas to modern generative models and human feedback.
  • Studies seamless human-machine interaction, a long-standing goal of AI, traditionally approached through reinforcement learning and bandit frameworks.
  • Made fundamental contributions to the bandit field, especially in structured problems involving graphs, submodularity, semi-bandit feedback, and low-rank matrices.
  • Developed online learning-to-rank bandit algorithms capable of handling exponentially large action spaces and partial feedback; these are simple, theoretically sound, robust, and state-of-the-art.
  • Recent efforts enhance the practicality of bandit algorithms via randomization-based exploration (compatible with neural networks) and reduce statistical complexity through meta-, multi-task, and federated learning.
  • Explores persistent challenges of exploration and statistically efficient adaptivity in the era of pre-trained models, such as optimal experimental design for efficient LLM fine-tuning and off-policy evaluation using logged human feedback.
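The randomization-based exploration mentioned above can be illustrated with a minimal Thompson sampling sketch for a Bernoulli bandit. This is an illustrative example only, not code from any of the listed papers; the arm means, horizon, and Beta(1, 1) priors are all assumptions chosen for the sketch.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Bernoulli Thompson sampling: maintain a Beta posterior per arm,
    draw one sample from each posterior, and pull the arm with the
    highest sample. Randomness in the posterior draws drives exploration."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [1] * k  # Beta(1, 1) uniform priors (assumed)
    failures = [1] * k
    pulls = [0] * k
    for _ in range(horizon):
        # One posterior sample per arm; the argmax is a randomized choice.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
```

As the posteriors concentrate, pulls shift toward the best arm (mean 0.8), which is the converge-to-near-optimal behavior described in the Background bullets.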