🤖 AI Summary
This study addresses the limitations of real-world human mobility data, which are often sparse and subject to participant bias, as well as the shortcomings of existing synthetic generation methods that struggle to balance realism with controllability. To overcome these challenges, this work proposes an integrated framework that unifies OpenStreetMap-based geographic simulation, genetic algorithm–driven parameter calibration, Patterns-of-Life behavioral modeling, structured data processing, and visualization analytics. The framework enables the generation of high-fidelity, scalable, individual-level synthetic trajectories. Empirical evaluation demonstrates that the resulting large-scale synthetic dataset closely matches real-world data in key statistical characteristics—such as daily trip frequency and activity radius—thereby providing a robust foundation for downstream modeling tasks and benchmarking studies.
📝 Abstract
Understanding individual-level human mobility is critical for a wide range of applications. Real-world trajectory datasets provide valuable insights into actual movement behaviors and patterns of life but are often constrained by data sparsity and participant bias. Synthetic data, by contrast, offers scalability and flexibility but frequently lacks realism. To address this gap, we introduce a comprehensive software pipeline for generating, calibrating, processing, and visualizing large-scale individual-level human mobility datasets that combine the realism of empirical data with the control and extensibility of Patterns-of-Life simulations. Our system consists of four integrated components: (1) a data generation engine that constructs geographically grounded simulations using OpenStreetMap data to produce diverse mobility logs; (2) a genetic algorithm-based calibration module that fine-tunes simulation parameters to align with real-world mobility characteristics, such as daily trip counts and radius of gyration, enabling realistic behavioral modeling; (3) a data processing suite that transforms raw simulation logs into structured formats suitable for downstream applications, including model training and benchmarking; and (4) a visualization module that extracts key mobility patterns and insights from the processed datasets and presents them through intuitive visual analytics for improved interpretability.
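To make the calibration idea in component (2) concrete, the following is a minimal sketch, not the authors' implementation. It computes the radius of gyration (the root-mean-square distance of visited points from their centroid) and uses a very small genetic algorithm to tune one parameter until a simulated trajectory matches a target value. Everything here is an illustrative assumption: `toy_simulator` is a hypothetical random-walk stand-in for the Patterns-of-Life simulation, `step_scale` is a made-up parameter, coordinates are treated as planar, and the selection, crossover, and mutation operators are generic choices rather than the ones used in the paper.

```python
import math
import random


def radius_of_gyration(points):
    """RMS distance of visited (x, y) points from their centroid (planar units)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


def toy_simulator(step_scale, n_steps=200, seed=0):
    """Hypothetical stand-in for a mobility simulation: a seeded 2D random walk."""
    rng = random.Random(seed)
    x = y = 0.0
    points = []
    for _ in range(n_steps):
        x += rng.gauss(0.0, step_scale)
        y += rng.gauss(0.0, step_scale)
        points.append((x, y))
    return points


def fitness(step_scale, target_rg):
    """Higher is better: negative error between simulated and target radius."""
    return -abs(radius_of_gyration(toy_simulator(step_scale)) - target_rg)


def calibrate(target_rg, pop_size=20, generations=30, seed=1):
    """Tiny genetic algorithm over a single parameter (illustrative operators)."""
    rng = random.Random(seed)
    population = [rng.uniform(0.1, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=lambda s: fitness(s, target_rg), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2              # crossover: average two parents
            child *= rng.gauss(1.0, 0.1)     # mutation: small multiplicative noise
            children.append(max(child, 1e-6))
        population = parents + children
    return max(population, key=lambda s: fitness(s, target_rg))


if __name__ == "__main__":
    # Pretend the "real-world" target came from a simulation with step_scale = 2.5.
    target = radius_of_gyration(toy_simulator(2.5))
    best = calibrate(target)
    print(f"target r_g = {target:.3f}, calibrated step_scale = {best:.3f}")
```

In a real pipeline the fitness function would presumably compare several statistics at once (for example daily trip counts alongside radius of gyration) and each evaluation would run a full simulation, so population size and generation count would trade off directly against compute cost.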