🤖 AI Summary
This study addresses the scarcity of high-quality public transit ridership and passenger demand data by integrating heterogeneous datasets from a single city, Niterói: bus GPS trajectories, approximately 7.2 million fare transactions, route and stop information, weather records, urban infrastructure, and sociodemographic statistics. The result is a high-resolution public transit dataset covering both supply and demand. Data cleaning, removal of operational inconsistencies, and anonymization of passenger identifiers support data quality and individual privacy. Access is controlled, limited to challenge participants who accept the Terms and Conditions and sign an NDA, to balance data sharing with privacy preservation. The dataset enables research on transit efficiency evaluation, passenger flow forecasting, accessibility analysis, and weather impact assessment, providing a foundation for data-driven urban transportation governance.
📝 Abstract
The NetMob Data Challenge releases a comprehensive public transportation dataset from Niterói, addressing the lack of high-quality mobility and passenger demand data. Based on operational records from March 2026, the dataset combines four main sources: GPS telemetry from buses, approximately 7.2 million ticketing transactions, auxiliary transit data (routes, stops, and weather), and urban infrastructure and socio-demographic information. Together, these sources provide a detailed view of both transit supply and passenger demand.
The data were preprocessed, cleaned, and anonymized to preserve privacy and improve reliability, including the removal of operational inconsistencies and anonymization of passenger identifiers. Access is restricted to challenge participants who accept the Terms and Conditions and sign an NDA.
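As an illustration only, anonymization of passenger identifiers could be sketched with keyed hashing, which replaces each identifier with a stable pseudonym so that trips by the same card remain linkable without exposing the original value. The abstract does not state which technique the authors used; the record layout, the card and route names, and the `pseudonymize` helper below are all hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymize(passenger_id: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of a passenger identifier."""
    return hmac.new(key, passenger_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret key; assumed to be held only by the data owner.
key = secrets.token_bytes(32)

# Hypothetical ticketing records: (card_id, route, timestamp).
transactions = [
    ("card-001", "route-A", "2026-03-02T07:15:00"),
    ("card-002", "route-B", "2026-03-02T07:16:30"),
    ("card-001", "route-C", "2026-03-02T18:02:10"),
]

# Replace the raw card identifier with its pseudonym; other fields are unchanged.
anonymized = [(pseudonymize(card, key), route, ts) for card, route, ts in transactions]

for record in anonymized:
    print(record)
```

Under this sketch, the same card always maps to the same pseudonym for a given key, so per-passenger trip chains survive, while recovering the original identifier would require the key.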
The paper describes the data collection and preprocessing pipeline, dataset organization, and mobility patterns observed in the system. The dataset supports research on topics such as public transportation efficiency, demand forecasting, accessibility analysis, service reliability, and the influence of external factors like weather on urban mobility.