🤖 AI Summary
This study investigates the capability of large language models (LLMs) to predict user repurchase intervals under zero-shot settings and systematically examines how the amount of contextual information influences predictive performance. Comparing LLMs against statistical baselines (e.g., exponential distribution) and specialized machine learning models, the findings reveal that while LLMs outperform simple statistical approaches, they fall significantly short of dedicated predictive models. Moreover, moderately increasing contextual detail improves LLM performance, yet excessive user-specific information leads to degradation—a counterintuitive phenomenon suggesting that “more context” is not always beneficial. These results challenge prevailing assumptions about context utilization in LLM-based forecasting and offer new directions for hybrid modeling frameworks that integrate statistical rigor with linguistic flexibility.
📝 Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities in reasoning and prediction across diverse domains. Yet their ability to infer temporal regularities from structured behavioral data remains underexplored. This paper presents a systematic study of whether LLMs can predict time intervals between recurring user actions, such as repeated purchases, and how different levels of contextual information shape their predictive behavior. Using a simple but representative repurchase scenario, we benchmark state-of-the-art LLMs in zero-shot settings against both statistical and machine-learning models. Two key findings emerge. First, while LLMs surpass lightweight statistical baselines, they consistently underperform dedicated machine-learning models, revealing their limited ability to capture quantitative temporal structure. Second, although moderate context can improve LLM accuracy, adding further user-level detail degrades performance. These results challenge the assumption that “more context leads to better reasoning.” Our study highlights fundamental limitations of today’s LLMs in structured temporal inference and offers guidance for designing future context-aware hybrid models that integrate statistical precision with linguistic flexibility.
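To make the comparison concrete, here is a minimal sketch of the kind of lightweight statistical baseline the summary mentions: fitting an exponential distribution to a user's observed inter-purchase gaps and predicting the next interval as the distribution's mean. The function names and the sample gap data are illustrative, not taken from the paper.

```python
from statistics import mean

def fit_exponential_rate(intervals):
    """MLE for an exponential distribution: rate = 1 / mean interval."""
    m = mean(intervals)
    if m <= 0:
        raise ValueError("intervals must be positive")
    return 1.0 / m

def predict_next_interval(intervals):
    """Point prediction under the exponential baseline:
    the expected value of the fitted distribution, i.e. 1 / rate."""
    return 1.0 / fit_exponential_rate(intervals)

# Hypothetical purchase-gap history for one user, in days.
gaps = [7, 10, 8, 12, 9]
rate = fit_exponential_rate(gaps)        # fitted rate parameter
prediction = predict_next_interval(gaps)  # predicted next gap = mean gap
```

A baseline this simple ignores trends, seasonality, and per-user covariates, which is exactly why the dedicated machine-learning models in the study are a much stronger reference point than the statistical baseline.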