ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions

📅 2026-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing conversational datasets capture user utterances but not the cognitive processes behind them, such as the intent behind a question or the reaction to a response, which limits our understanding of human–AI interaction. To address this gap, this work introduces ThoughtTrace, the first large-scale dataset of multi-turn human–AI dialogues paired with users’ self-reported thoughts, comprising 1,058 participants, 2,155 dialogues, and 10,174 thought annotations. Treating “user thoughts” as a new data modality, the study shows that they are semantically distinct from messages and tied to conversation stages, and uses them for behavior prediction and thought-guided response rewriting. Experiments show that thoughts improve user-behavior prediction when supplied as inference-time context, and that thought-guided rewrites provide fine-grained alignment signals for training personalized assistants.
📝 Abstract
Conversational AI has now reached billions of users, yet existing datasets capture only what people say, not what they think. We introduce ThoughtTrace, the first large-scale dataset that pairs real-world multi-turn human-AI conversations with users' self-reported thoughts: their reasons for sending prompts and reactions to assistant responses. ThoughtTrace comprises 1,058 users, 2,155 conversations, 17,058 turns, and 10,174 thought annotations collected across 20 language models. Our analysis shows that ThoughtTrace captures long-horizon, topically diverse interactions, and that thoughts are semantically distinct from messages, difficult for frontier LLMs to infer from context, diverse in content, and tied to conversation stages. We further demonstrate the utility of thoughts for downstream modeling. First, thoughts improve user-behavior prediction as inference-time context. Second, thought-guided rewrites provide fine-grained alignment signals for training personalized assistants. Together, ThoughtTrace establishes user thoughts as a new data modality for studying the cognitive dynamics behind human-AI interaction and provides a foundation for building assistants that better understand and adapt to users' latent goals, preferences, and needs.
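To give a rough sense of scale, the totals reported in the abstract can be turned into simple per-unit averages. The sketch below is only arithmetic on those four numbers; the names `TOTALS` and `per_unit_averages` are hypothetical, and the resulting ratios are derived here rather than reported by the authors. In particular, the thoughts-per-turn ratio is a raw quotient and does not imply that annotations map one-to-one onto turns.

```python
# Dataset totals as reported in the abstract.
TOTALS = {
    "users": 1_058,
    "conversations": 2_155,
    "turns": 17_058,
    "thought_annotations": 10_174,
}


def per_unit_averages(totals: dict[str, int]) -> dict[str, float]:
    """Derive simple averages from the reported totals, rounded to 2 decimals.

    These ratios are illustrative derivations, not figures from the paper.
    """
    return {
        "conversations_per_user": round(totals["conversations"] / totals["users"], 2),
        "turns_per_conversation": round(totals["turns"] / totals["conversations"], 2),
        "thoughts_per_conversation": round(
            totals["thought_annotations"] / totals["conversations"], 2
        ),
        "thoughts_per_turn": round(totals["thought_annotations"] / totals["turns"], 2),
    }


if __name__ == "__main__":
    for name, value in per_unit_averages(TOTALS).items():
        print(f"{name}: {value}")
```

On these totals, a conversation averages roughly eight turns and each user contributes about two conversations. These are means only; the abstract does not describe how turns or thoughts are distributed across conversations.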
Problem

Research questions and friction points this paper is trying to address.

user thoughts
human-AI interaction
conversational AI
latent goals
cognitive dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

ThoughtTrace
user thoughts
human-AI interaction
cognitive dynamics
personalized assistants