AI Summary
This study addresses the limited awareness of continuous time constraints in large language models (LLMs) during real-time dialogue, which hinders their ability to adjust strategy under genuine deadlines. Using simulated negotiations between paired agents under strict deadlines, the work compares model behavior when agents receive remaining-time updates at each turn against a control in which they know only the total time limit, and finds that strategic failures stem from deficient time tracking rather than inadequate reasoning capacity. The authors propose a reproducible evaluation framework that combines real-time temporal prompts with controlled experiments, validated across multiple mainstream LLMs. Results show that explicit time information substantially improves deal closure rates (for example, from 4% to 32% for GPT-5.1), and that the same models reach closure rates of at least 95% under turn-based limits, indicating that LLMs have latent strategic capability but lack intrinsic temporal awareness.
Abstract
Large Language Models (LLMs) generate text token-by-token in discrete time, yet real-world communication, from therapy sessions to business negotiations, critically depends on continuous time constraints. Current LLM architectures and evaluation protocols rarely test for temporal awareness under real-time deadlines. We use simulated negotiations between paired agents under strict deadlines to investigate how LLMs adjust their behavior in time-sensitive settings. In a control condition, agents know only the global time limit. In a time-aware condition, they receive remaining-time updates at each turn. Deal closure rates are substantially higher (32\% vs. 4\% for GPT-5.1) and offer acceptances are sixfold higher in the time-aware condition than in the control, suggesting LLMs struggle to internally track elapsed time. However, the same LLMs achieve near-perfect deal closure rates ($\geq$95\%) under turn-based limits, revealing the failure is in temporal tracking rather than strategic reasoning. These effects replicate across negotiation scenarios and models, illustrating a systematic lack of LLM time awareness that will constrain LLM deployment in many time-sensitive applications.
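The difference between the two conditions can be illustrated with a minimal sketch. It assumes the remaining-time update is a short text header prepended to each turn; the abstract does not give the actual prompt wording, so the names `DeadlineClock` and `turn_header` and the message format are hypothetical and not taken from the paper.

```python
import time
from typing import Callable


class DeadlineClock:
    """Tracks elapsed wall-clock time against a fixed negotiation deadline.

    Hypothetical helper for illustration; not the authors' implementation.
    """

    def __init__(self, total_seconds: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self.total_seconds = total_seconds
        self._now = now          # injectable clock, useful for testing
        self._start = now()      # the deadline is measured from construction

    def remaining(self) -> float:
        """Seconds left before the deadline, never below zero."""
        elapsed = self._now() - self._start
        return max(0.0, self.total_seconds - elapsed)


def turn_header(clock: DeadlineClock, time_aware: bool) -> str:
    """Build the timing text shown to an agent at the start of a turn.

    Control condition: only the global time limit is stated.
    Time-aware condition: the remaining time is added at every turn.
    The exact wording here is an assumption.
    """
    header = f"Time limit: {clock.total_seconds:.0f}s."
    if time_aware:
        header += f" Time remaining: {clock.remaining():.0f}s."
    return header
```

In a negotiation loop, `turn_header(clock, time_aware=True)` would be recomputed before each agent turn, so the model never has to infer elapsed time from the dialogue itself. In the control condition the header stays constant for the whole conversation.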