Training Proactive and Personalized LLM Agents

📅 2025-11-04

🤖 AI Summary

Background: Prior agent research predominantly optimizes for task success rate and neglects proactivity (e.g., asking clarifying questions) and personalization (i.e., adapting to user preferences). Method: We introduce UserVille, a simulation environment with LLM-driven user simulators, and PPP, a multi-objective reinforcement learning framework that jointly optimizes productivity, proactivity, and personalization. Agents trained with PPP learn to ask strategic clarifying questions and to adapt to user preferences not seen during training. Contribution/Results: On software engineering and deep-research tasks, PPP-trained agents outperform strong baselines, including GPT-5, by 21.6 points on average, improving both task success and alignment with user preferences.

📝 Abstract

While existing work focuses primarily on task success, we argue that effective real-world agents require optimizing three dimensions: productivity (task completion), proactivity (asking essential questions), and personalization (adapting to diverse user preferences). We introduce UserVille, an interactive environment with LLM-based user simulators enabling diverse, configurable user preferences. Leveraging UserVille, we introduce PPP, a multi-objective reinforcement learning approach that jointly optimizes all three dimensions: Productivity, Proactivity, and Personalization. Experiments on software engineering and deep research tasks show that agents trained with PPP achieve substantial improvements over strong baselines such as GPT-5 (+21.6 on average), demonstrating the ability to ask strategic clarifying questions, adapt to unseen user preferences, and improve task success through better interaction. This work demonstrates that explicitly optimizing for user-centered interaction is critical for building practical and effective AI agents.
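To make the idea of jointly optimizing the three dimensions concrete, here is a minimal sketch of a scalarized episode reward. It is not the paper's reward definition: the abstract gives no formula, so the weighted-sum form, the field names in `EpisodeSignals`, the function name `ppp_style_reward`, and every numeric coefficient are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EpisodeSignals:
    """Hypothetical per-episode signals; none of these names come from the paper."""
    task_success: float         # productivity: 1.0 if the task was completed, else 0.0
    essential_questions: int    # clarifying questions that were actually needed
    unnecessary_questions: int  # questions that added no information
    preference_violations: int  # times the agent ignored the user's stated preference


def ppp_style_reward(s: EpisodeSignals,
                     w_prod: float = 1.0,
                     w_proact: float = 0.5,
                     w_pers: float = 0.5) -> float:
    """Combine the three objectives into one scalar (assumed weighted sum)."""
    # Assumed shaping: reward needed questions, penalize unneeded ones.
    proactivity = 0.1 * s.essential_questions - 0.2 * s.unnecessary_questions
    # Assumed shaping: each preference violation costs a fixed penalty.
    personalization = -0.25 * s.preference_violations
    return (w_prod * s.task_success
            + w_proact * proactivity
            + w_pers * personalization)


if __name__ == "__main__":
    solved_politely = EpisodeSignals(1.0, 2, 0, 0)
    solved_rudely = EpisodeSignals(1.0, 0, 3, 2)
    print(ppp_style_reward(solved_politely), ppp_style_reward(solved_rudely))
```

Under this sketch, two agents that both solve the task receive different rewards when one of them asks needless questions or ignores the user's preference, which is the behavior a success-only objective cannot distinguish.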

Problem

Research questions and friction points this paper is trying to address.

Optimizing productivity, proactivity, and personalization in agents

Training agents to ask strategic questions and adapt to user preferences

Improving task success through user-centered interaction design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-objective reinforcement learning jointly optimizes productivity, proactivity, and personalization

Interactive environment with LLM-based user simulators enables configurable user preferences

Agents adapt to unseen preferences and ask strategic clarifying questions
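The notion of a user simulator with configurable preferences can be illustrated with a toy example. UserVille's simulators are LLM-based; the sketch below replaces the LLM with simple rule checks so that it runs on its own. The class names `Preference` and `SimulatedUser`, the `PREFERENCES` table, and the two example preferences are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Preference:
    """A hypothetical user preference expressed as a check on an agent message."""
    description: str
    is_satisfied: Callable[[str], bool]


# Illustrative preference pool; swapping the key reconfigures the simulated user.
PREFERENCES: Dict[str, Preference] = {
    "one_question": Preference(
        "Ask at most one question per message.",
        lambda message: message.count("?") <= 1,
    ),
    "concise": Preference(
        "Keep each message under 20 words.",
        lambda message: len(message.split()) < 20,
    ),
}


class SimulatedUser:
    """Rule-based stand-in for an LLM user simulator (an assumption of this sketch)."""

    def __init__(self, preference_key: str) -> None:
        self.preference = PREFERENCES[preference_key]
        self.violations = 0

    def receive(self, agent_message: str) -> bool:
        """Return True if the message respects the preference; count violations."""
        ok = self.preference.is_satisfied(agent_message)
        if not ok:
            self.violations += 1
        return ok


if __name__ == "__main__":
    user = SimulatedUser("one_question")
    print(user.receive("Which branch should I target?"))
    print(user.receive("Which branch? And which test suite?"))
    print(user.violations)
```

Holding some preference keys out of training and evaluating on them is one simple way to probe adaptation to unseen preferences in a setup like this.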
