PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning

📅 2026-01-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses preference-driven decision-making in calendar conflicts—such as choosing to attend, reschedule, or decline overlapping meetings—by proposing PEARL, a reinforcement learning–based self-evolving language agent. PEARL incorporates an external memory module and a round-wise dynamic reward mechanism to enable continuous online modeling of, and adaptation to, user preferences. As the first reinforcement learning framework specifically designed for calendar conflict resolution, PEARL significantly outperforms existing baselines on the newly introduced CalConflictBench benchmark, improving average error rate by 55% over the strongest baseline and achieving an error reduction rate of 0.76, demonstrating its effectiveness and generalization in long-horizon scenarios.

📝 Abstract
Overlapping calendar invitations force busy professionals to repeatedly decide which meetings to attend, reschedule, or decline. We refer to this preference-driven decision process as calendar conflict resolution. Automating this decision process is crucial yet challenging. Scheduling logistics can drain hours, and human delegation often fails at scale, which motivates us to ask: Can we trust large language models (LLMs) or language agents to manage time? To enable a systematic study of this question, we introduce CalConflictBench, a benchmark for long-horizon calendar conflict resolution. In CalConflictBench, conflicts are presented to agents round-by-round over a calendar year, requiring them to infer and adapt to user preferences progressively. Our experiments show that current LLM agents perform poorly with high error rates, e.g., Qwen-3-30B-Think has an average error rate of 35%. To address this gap, we propose PEARL, a reinforcement-learning framework that (i) augments the language agent with an external preference memory that stores and updates inferred strategies (e.g., attendee priorities, topic importance, time/location preferences), and (ii) optimizes the agent with round-wise rewards that directly supervise decision correctness, ranking quality, and memory usage across rounds. Experiments on CalConflictBench show that PEARL achieves an error reduction rate of 0.76 and a 55% improvement in average error rate compared to the strongest baseline.
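The abstract describes a round-wise loop: at each round the agent ranks conflicting meetings using an external preference memory, acts, and receives a reward that both supervises the decision and drives memory updates. The sketch below is a minimal, hypothetical illustration of that loop, not PEARL's actual implementation: the `PreferenceMemory` class, its feature-weight update rule, and the `resolve_round` reward shape are all simplifying assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class PreferenceMemory:
    """Hypothetical external memory: feature -> inferred preference weight."""
    weights: dict = field(default_factory=dict)

    def score(self, meeting: dict) -> float:
        # Rank meetings by the sum of learned weights over their features
        # (e.g., attendee priority, topic, time/location tags).
        return sum(self.weights.get(f, 0.0) for f in meeting["features"])

    def update(self, meeting: dict, reward: float) -> None:
        # Reinforce the chosen meeting's features in proportion to the
        # round-wise reward (a stand-in for PEARL's learned updates).
        for f in meeting["features"]:
            self.weights[f] = self.weights.get(f, 0.0) + 0.1 * reward

def resolve_round(memory: PreferenceMemory, conflict: list, true_choice: str):
    """One round: rank the conflicting meetings, pick one, get a reward."""
    ranked = sorted(conflict, key=memory.score, reverse=True)
    chosen = ranked[0]
    # Round-wise reward: +1 if the decision matches the user's preference.
    reward = 1.0 if chosen["id"] == true_choice else -1.0
    memory.update(chosen, reward)
    return chosen["id"], reward
```

Over a year of rounds, such a loop lets inferred preferences accumulate in memory, which is the adaptation behavior CalConflictBench is designed to measure.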
Problem

Research questions and friction points this paper is trying to address.

calendar conflict resolution
time management
preference-driven decision
large language model
scheduling automation
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning
language agent
calendar conflict resolution
preference adaptation
external memory