🤖 AI Summary
Current large language model (LLM) alignment approaches predominantly rely on immediate preference signals, which fail to capture how user preferences evolve over time and in response to real-world outcomes. This work proposes a longitudinal alignment framework, instantiated as BITE, a browser-based system that supports in-situ preference elicitation, context-triggered reflective prompting at later decision points, and progressive, user-controlled consent for sharing behavioral data. The resulting longitudinal, context-aware approach to human-LLM alignment evaluation surfaced differences between immediate and delayed preferences. A two-week deployment study with eight participants illustrates the limitations of single-episode evaluations across dimensions such as accuracy and relevance, underscoring the importance of longitudinal alignment measurement.
📝 Abstract
Current human-AI alignment and evaluation methods for large language models (LLMs) often rely on preference signals collected immediately after an interaction. This practice implicitly treats preferences as static, even though many LLM-mediated decisions unfold over time and may be evaluated differently once their real-world consequences have been observed. We therefore argue for a methodological shift from single-moment preference elicitation to longitudinal, context-situated alignment measurement. We present a methodological framework for collecting temporally grounded alignment signals by combining (1) in-situ preference capture, (2) context-triggered follow-up preference reflection, and (3) privacy-preserving behavioral traces that help interpret preference change. As an instantiation of this methodology, we introduce BITE, a browser-based system that detects consequential LLM interactions, prompts reflection at later decision points, and supports progressive, user-controlled consent for sharing behavioral data. In a two-week longitudinal deployment study with 8 participants, our approach surfaced differences between immediate and later user preferences in accuracy, relevance, and other dimensions of the LLM output. Our findings highlight the limitations of single-moment preference datasets and underscore the importance of longitudinal methods for alignment evaluation in everyday use.