After Talking with 1,000 Personas: Learning Preference-Aligned Proactive Assistants From Large-Scale Persona Interactions

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of maintaining user trust in proactive intelligent assistants, which is often undermined by poorly timed or intrusive interventions. Because real-world preference data is scarce and rarely captures how preferences change across sessions, the authors propose a population-to-individual preference alignment framework that operates under on-device and privacy constraints. The approach learns shared preference structure across timing, autonomy, and communication style from simulated multi-session interactions with 1,000 diverse LLM-generated personas, which gives the assistant a cold start without real user logs. Personalization then happens on device through lightweight activation-based steering driven by simple interaction feedback, with no model retraining or cloud-side updates. In controlled simulations and a human-subject study (N=30), the method improves timing decisions and perceived interaction quality over untuned and direct-response baselines, and on-device activation steering performs comparably to reinforcement learning from human feedback. Participants also report higher satisfaction, trust, and comfort as the assistant adapts over multiple sessions.

📝 Abstract

Smart assistants increasingly act proactively, yet mistimed or intrusive behavior often causes users to lose trust and disable these features. Learning user preferences for proactive assistance is difficult because real-world studies are costly, limited in scale, and rarely capture how preferences change across multiple interaction sessions. Large language model-based generative agents offer a way to simulate realistic interactions, but existing synthetic datasets remain limited in temporal depth, persona diversity, and coverage of multi-dimensional preferences. They also provide little support for transferring population-level insights to individual users under on-device constraints. We present a population-to-individual learning framework for preference-aligned proactive assistants that operates under on-device and privacy constraints. Our approach uses large-scale interaction simulation with 1,000 diverse personas to learn shared structure in how users express preferences across recurring dimensions such as timing, autonomy, and communication style, providing a strong cold start without relying on real user logs. The assistant then adapts to individual users on device through lightweight activation-based steering driven by simple interaction feedback, without model retraining or cloud-side updates. We evaluate the framework using controlled simulations with 1,000 simulated personas and a human-subject study with 30 participants. Results show improved timing decisions and perceived interaction quality over untuned and direct-response baselines, while on-device activation steering achieves performance comparable to reinforcement learning from human feedback. Participants also report higher satisfaction, trust, and comfort as the assistant adapts over multiple interaction sessions.
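
To make the idea of feedback-driven activation steering more concrete, here is a minimal sketch of the general technique. It is not the authors' implementation: the abstract does not specify which activations are steered, how the steering directions are obtained, or how feedback updates them. Everything named here (`SteeringState`, `steer`, the step size, the clamp, and the toy direction vectors) is a hypothetical stand-in chosen for illustration. The sketch only shows the core pattern: model weights stay fixed, and a small per-user coefficient for each preference dimension scales a direction vector that is added to a hidden state.

```python
from dataclasses import dataclass, field

# Preference dimensions named in the abstract; their use as dict keys is an assumption.
DIMENSIONS = ("timing", "autonomy", "communication_style")


@dataclass
class SteeringState:
    """Per-user steering coefficients kept on device (illustrative only)."""
    # One scalar per dimension; 0.0 means "no adjustment beyond the shared prior".
    alphas: dict = field(default_factory=lambda: {d: 0.0 for d in DIMENSIONS})
    step: float = 0.25     # assumed update size per feedback event
    max_abs: float = 2.0   # assumed clamp that keeps steering bounded

    def update(self, dimension: str, feedback: int) -> None:
        """Nudge one coefficient from simple feedback: +1 (liked) or -1 (disliked)."""
        if feedback not in (-1, 1):
            raise ValueError("feedback must be +1 or -1")
        new = self.alphas[dimension] + self.step * feedback
        self.alphas[dimension] = max(-self.max_abs, min(self.max_abs, new))


def steer(hidden, directions, state):
    """Return hidden + sum over dimensions of alpha * direction; weights are untouched."""
    steered = list(hidden)
    for dim, vec in directions.items():
        alpha = state.alphas[dim]
        steered = [h + alpha * v for h, v in zip(steered, vec)]
    return steered


# Toy 4-dimensional hidden state and made-up steering directions.
hidden = [0.5, -1.0, 0.0, 2.0]
directions = {
    "timing": [1.0, 0.0, 0.0, 0.0],
    "autonomy": [0.0, 1.0, 0.0, 0.0],
    "communication_style": [0.0, 0.0, 1.0, 0.0],
}

state = SteeringState()
state.update("timing", -1)    # user dismissed a poorly timed suggestion
state.update("timing", -1)    # and dismissed another one
state.update("autonomy", +1)  # user accepted an autonomous action
steered = steer(hidden, directions, state)
print(state.alphas)
print(steered)
```

In a real system the direction vectors would presumably come from the population-level stage and the addition would happen inside the model's forward pass; the only per-user state stored on device in this sketch is a handful of scalars, which is what makes this style of adaptation cheap compared with retraining.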

Problem

Research questions and friction points this paper is trying to address.

proactive assistance

user preference learning

on-device adaptation

persona simulation

privacy constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

proactive assistant

persona simulation

on-device personalization