AI Summary
This study addresses the lack of effective methods for collecting real-world large language model (LLM) usage data under strict privacy constraints, which has hindered understanding of LLMs' actual behaviors and alignment issues. The authors propose a participatory, privacy-first longitudinal measurement framework that automatically removes personally identifiable information (PII) on-device and employs a zero-retention architecture for raw data, coupled with personalized "year-in-review" reports to incentivize voluntary sharing of anonymized conversations. This novel paradigm uniquely integrates user incentives with storage-free observation. In an initial deployment involving 82 U.S. adults and 48,495 dialogues, the approach revealed dual patterns of instrumental and reflective LLM use, with heavy users showing a stronger tendency toward reflective interaction. The framework thus offers an ethical and scalable pathway for in-the-wild LLM research.
Abstract
Alignment research on large language models (LLMs) increasingly depends on understanding how these systems are used in everyday contexts, yet naturalistic interaction data is difficult to access due to privacy constraints and platform control. We present AI-Wrapped, a prototype workflow for collecting naturalistic LLM usage data while providing participants with an immediate "wrapped"-style report on their usage statistics, top topics, and safety-relevant behavioral patterns. We report findings from an initial deployment with 82 U.S.-based adults across 48,495 conversations from their 2025 histories. Participants used LLMs for both instrumental and reflective purposes, including creative work, professional tasks, and emotional or existential themes. Some usage patterns were consistent with potential over-reliance or perfectionistic refinement, while heavier users showed comparatively more reflective exchanges than primarily transactional ones. Methodologically, even with zero data retention and PII removal, participants may remain hesitant to share chat data due to perceived privacy and judgment risks, underscoring the importance of trust, agency, and transparent design when building measurement infrastructure for alignment research.