Adopt $\neq$ Adapt: Longitudinal Analyses of LLM Conversations in the Wild

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the limited understanding of how individual users change their behavior over long-term interaction with large language models. The authors analyze the conversational trajectories of approximately 12,000 randomly sampled Microsoft Bing Copilot users and compare them with the WildChat-4.8M dataset. Although the Copilot data shows significant population-level trends, trends within individual user trajectories are much weaker: user habits are highly stable over time. More active users have more successful conversations and use the LLM for more complex, professionally oriented tasks than less active users. The comparison also indicates that WildChat-4.8M is significantly skewed toward highly proficient "power" users, so it does not represent typical user–AI interactions.

📝 Abstract

Although a growing body of research has begun to describe user--LLM interactions, the picture it paints is largely static; little is known about how individual users change their behavior over time. To address this gap, we analyze the conversational trajectories of $\sim$12,000 randomly sampled Microsoft Bing Copilot users and compare these with data from WildChat-4.8M. While the Copilot data contains significant population-level trends, we find that trends in individual user trajectories are much weaker; user habits prove to be overwhelmingly sticky. We also find stark differences between users of different activity levels: more active users have more successful conversations and use the LLM for more complex and professionally oriented tasks. Some user trends also appear in WildChat-4.8M, but we find evidence that this dataset is significantly skewed towards highly proficient "power" users. Ultimately, our results suggest that existing user behavior is difficult to change and demonstrate the extent of user heterogeneity. Our comparison between datasets highlights that WildChat does not represent typical user-AI interactions, an important caveat for downstream uses of the data.
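The contrast between population-level trends and individual trajectories can be made concrete with a small sketch. This is not the authors' method; it is a minimal illustration, assuming each observation is a hypothetical `(user_id, week, score)` record, of how a pooled trend can coexist with flat per-user trajectories (for instance, when the mix of active users shifts over time). The function names and the toy data are invented for this example.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def population_vs_individual(records):
    """Compare a pooled trend with the average within-user trend.

    records: list of (user_id, week, score) tuples (hypothetical schema).
    Returns (pooled_slope, mean_within_user_slope).
    """
    # Pooled slope: ignore who produced each observation.
    pooled = ols_slope([w for _, w, _ in records],
                       [s for _, _, s in records])

    # Group observations by user.
    by_user = {}
    for user, week, score in records:
        by_user.setdefault(user, []).append((week, score))

    # Within-user slopes: one regression per user with >= 2 distinct weeks.
    slopes = []
    for obs in by_user.values():
        weeks = [w for w, _ in obs]
        if len(set(weeks)) >= 2:
            slopes.append(ols_slope(weeks, [s for _, s in obs]))

    return pooled, mean(slopes)


# Toy data (invented): every user is flat over time,
# but users who are active later have higher scores.
records = [
    ("a", 0, 1.0), ("a", 1, 1.0),
    ("b", 1, 2.0), ("b", 2, 2.0),
    ("c", 2, 3.0), ("c", 3, 3.0),
]
pooled, within = population_vs_individual(records)
print(f"pooled slope: {pooled:.3f}, mean within-user slope: {within:.3f}")
```

In this toy case the pooled slope is positive while every individual slope is zero, which is the kind of gap a longitudinal, per-user analysis is designed to expose.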

Problem

Research questions and friction points this paper is trying to address.

user behavior

longitudinal analysis

LLM interactions

behavioral change

user heterogeneity

Innovation

Methods, ideas, or system contributions that make the work stand out.

longitudinal analysis

user behavior

LLM interaction