🤖 AI Summary
Existing mobile app prediction models rely on fixed app vocabularies, which limits generalization to new devices or app ecosystems, and many depend on user-specific knowledge, which hurts cold-start performance. This work proposes STAP, a vocabulary-free Transformer-based model whose shuffle-tokenization mechanism replaces true app identities with randomly reassigned virtual indices; an ultra-long context over the user's behavioral sequence compensates for the discarded semantic information. This enables zero-shot cross-dataset prediction, a setting in which fixed-vocabulary methods are inherently inapplicable, and a theoretical analysis shows that the predicted distribution converges to the correct one given a sufficiently long context. Experiments on two datasets from different continents show strong zero-shot accuracy, cold-start performance competitive with leading models, and, with the proposed deployment strategy, inference latency within acceptable bounds.
📝 Abstract
Predicting the next mobile application a user will launch is essential for intelligent device resource management and proactive assistance. Existing models rely on fixed app vocabularies, which prevents them from generalizing across different app ecosystems. Many also depend on user-specific knowledge, which complicates deployment in cold-start scenarios. We propose STAP, a Transformer-based model that eliminates the need for a fixed vocabulary. STAP replaces true app identities with randomly reassigned virtual indices via a shuffle mechanism, and compensates for the discarded semantic information by processing behavioral sequences with an ultra-long context design. A theoretical analysis shows that, given a sufficiently long context, the predicted distribution converges to the correct one despite the anonymity of the mapping. Experiments on two datasets from different continents demonstrate that STAP achieves strong cross-dataset zero-shot prediction accuracy (a setting where all existing fixed-vocabulary methods are inherently inapplicable), while its cold-start performance within each dataset remains competitive with leading models. Furthermore, we introduce a deployment strategy that enables the model to retain a sufficiently long context during continuous inference while keeping latency within acceptable bounds.
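The shuffle mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how virtual indices are drawn, so this assumes one fresh random assignment per sequence from a fixed pool of slots. The names `shuffle_tokenize` and `num_virtual_slots` are hypothetical.

```python
import random


def shuffle_tokenize(app_sequence, num_virtual_slots, rng=None):
    """Replace app identities in one sequence with random virtual indices.

    Assumption (not from the paper): each sequence gets its own random
    assignment, drawn without replacement from `num_virtual_slots` slots.
    """
    rng = rng or random.Random()
    # Distinct apps in first-seen order (dict preserves insertion order).
    distinct_apps = list(dict.fromkeys(app_sequence))
    if len(distinct_apps) > num_virtual_slots:
        raise ValueError("more distinct apps than virtual slots")
    # Fresh random assignment: the index carries no app identity.
    virtual_ids = rng.sample(range(num_virtual_slots), len(distinct_apps))
    mapping = dict(zip(distinct_apps, virtual_ids))
    # The same app always maps to the same index within this sequence,
    # so repetition and ordering patterns survive the anonymization.
    return [mapping[app] for app in app_sequence], mapping


# Usage: a predicted virtual index is mapped back through the inverse table.
tokens, mapping = shuffle_tokenize(["mail", "maps", "mail", "chat"], 8)
inverse = {virtual_id: app for app, virtual_id in mapping.items()}
decoded = [inverse[t] for t in tokens]
```

Because the assignment is arbitrary, a model trained on such tokens can only learn from patterns within the context itself, which is consistent with the abstract's emphasis on a long context and with the model being applicable to datasets whose apps it has never seen.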