🤖 AI Summary
Modeling location-based social network (LBSN) check-in trajectories is challenging because they are spatially discrete, temporally irregular, sparse in activity, and shaped by inherently uncertain human mobility. To address these issues, this paper proposes GeoGen, a two-stage coarse-to-fine generative framework. In the first stage, discrete, sparse check-in sequences are reconstructed into spatially continuous, temporally regular latent movement sequences, whose behavioral patterns are learned by a sparsity-aware spatio-temporal diffusion model (S$^2$TDiff). In the second stage, Coarse2FineNet, a Transformer-based Seq2Seq architecture with a dynamic context fusion encoder and a multi-task hybrid-head decoder, generates fine-grained check-in trajectories from the coarse latent sequences by modeling semantic relevance and behavioral uncertainty. The generated synthetic trajectories are intended to preserve the characteristics of real data while protecting user privacy. Experiments on four real-world datasets show that GeoGen outperforms state-of-the-art models in both fidelity and utility evaluation; on FS-TKY, it improves the distance and radius metrics by over 69% and 55%, respectively.
📝 Abstract
Location-Based Social Network (LBSN) check-in trajectory data are important for many practical applications, such as POI recommendation, advertising, and pandemic intervention. However, high collection costs and ever-increasing privacy concerns prevent access to large-scale LBSN trajectory data. Recent advances in synthetic data generation offer a new way around this limitation: generative AI can produce synthetic data that preserves the characteristics of real data while ensuring privacy protection. Nevertheless, generating synthetic LBSN check-in trajectories remains challenging due to their spatially discrete, temporally irregular nature and the complex spatio-temporal patterns caused by sparse activities and uncertain human mobility. To address this challenge, we propose GeoGen, a two-stage coarse-to-fine framework for large-scale LBSN check-in trajectory generation. In the first stage, we reconstruct spatially continuous, temporally regular latent movement sequences from the original LBSN check-in trajectories and then design a Sparsity-aware Spatio-temporal Diffusion model (S$^2$TDiff) with an efficient denoising network to learn their underlying behavioral patterns. In the second stage, we design Coarse2FineNet, a Transformer-based Seq2Seq architecture equipped with a dynamic context fusion mechanism in the encoder and a multi-task hybrid-head decoder, which generates fine-grained LBSN trajectories from coarse-grained latent movement sequences by modeling semantic relevance and behavioral uncertainty. Extensive experiments on four real-world datasets show that GeoGen outperforms state-of-the-art models in both fidelity and utility evaluation; for example, it improves the distance and radius metrics by over 69% and 55%, respectively, on the FS-TKY dataset.
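To make the first-stage idea of a "spatially continuous, temporally regular" sequence concrete, the sketch below resamples a few irregular check-ins onto a fixed time grid using plain linear interpolation. This is only an illustration under stated assumptions: the abstract does not specify how GeoGen performs its reconstruction, and the function name `resample_checkins`, the `(hour, lat, lon)` tuple layout, and the sample coordinates are all hypothetical.

```python
from bisect import bisect_right

def resample_checkins(checkins, step_h=1.0):
    """Linearly interpolate irregular (hour, lat, lon) check-ins onto a regular grid.

    Illustrative stand-in only, NOT the reconstruction used by GeoGen.
    Assumes at least two check-ins with distinct timestamps.
    """
    pts = sorted(checkins)              # order check-ins by timestamp
    times = [p[0] for p in pts]
    n_steps = int((times[-1] - times[0]) / step_h + 1e-9)
    out = []
    for k in range(n_steps + 1):
        t = times[0] + k * step_h
        # last check-in at or before t, clamped so that pts[i + 1] exists
        i = min(bisect_right(times, t) - 1, len(pts) - 2)
        t0, lat0, lon0 = pts[i]
        t1, lat1, lon1 = pts[i + 1]
        w = (t - t0) / (t1 - t0)        # interpolation weight in [0, 1]
        out.append((t, lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)))
    return out

# Three hypothetical check-ins at hours 0, 3, and 4
regular = resample_checkins([(0.0, 35.0, 139.0), (3.0, 35.3, 139.6), (4.0, 35.3, 139.7)])
```

Here the three sparse check-ins become five hourly points; the gap between hours 0 and 3 is filled with evenly spaced positions. A learned model would replace this naive interpolation, but the input and output shapes convey what "temporally regular" means.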