DiversityOne: A Multi-Country Smartphone Sensor Dataset for Everyday Life Behavior Modeling

📅 2025-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing smartphone behavioral datasets suffer from narrow national coverage, small sample sizes, and limited sensor modalities, which hinders cross-cultural behavioral modeling and rigorous evaluation of model generalizability. To address these limitations, we introduce DiversityOne, one of the largest and most diverse publicly available multinational smartphone behavioral datasets to date. It was collected over four weeks from 782 university students across eight countries spanning the Global North and South. It comprises 26 types of raw sensor time-series data and over 350,000 fine-grained, context-aware self-reports, enriched with demographic, psychological, and socio-cultural metadata. This project pioneers standardized, cross-cultural collaborative data collection under a unified data governance protocol. The dataset improves reproducibility in cross-national behavioral modeling and supports domain adaptation research. As foundational infrastructure for ubiquitous computing, it advances empirical investigation of model robustness and generalization across culturally and geographically diverse populations.

📝 Abstract
Understanding the everyday life behavior of young adults through personal devices, e.g., smartphones and smartwatches, is key for various applications, from enhancing the user experience in mobile apps to enabling appropriate interventions in digital health apps. Towards this goal, previous studies have relied on datasets combining passive sensor data with human-provided annotations or self-reports. However, many existing datasets are limited in scope, often focusing on specific countries primarily in the Global North, involving a small number of participants, or using a limited range of pre-processed sensors. These limitations restrict the ability to capture cross-country variations of human behavior and to study model generalization and robustness. To address this gap, we introduce DiversityOne, a dataset which spans eight countries (China, Denmark, India, Italy, Mexico, Mongolia, Paraguay, and the United Kingdom) and includes data from 782 college students over four weeks. DiversityOne contains data from 26 smartphone sensor modalities and 350K+ self-reports. As of today, it is one of the largest and most diverse publicly available datasets, and it also features extensive demographic and psychosocial survey data. DiversityOne opens the possibility of studying important research problems in ubiquitous computing, particularly domain adaptation and generalization across countries, research areas that have so far been largely underexplored because of the lack of adequate datasets.
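One common way to study generalization across countries is a leave-one-country-out evaluation, where a model is trained on seven countries and tested on the eighth. The sketch below only illustrates that splitting idea. It is not the authors' protocol, and the record layout (a dict with a `country` key) and the function name `leave_one_country_out` are assumptions made for illustration, not the dataset's actual schema.

```python
from collections import defaultdict


def leave_one_country_out(records):
    """Yield (held_out_country, train, test) splits.

    `records` is an iterable of dicts with a hypothetical "country" key;
    the real DiversityOne schema may differ.
    """
    by_country = defaultdict(list)
    for record in records:
        by_country[record["country"]].append(record)

    # Hold out each country in turn (sorted for a deterministic order).
    for held_out in sorted(by_country):
        test = by_country[held_out]
        train = [
            row
            for country, rows in by_country.items()
            if country != held_out
            for row in rows
        ]
        yield held_out, train, test


if __name__ == "__main__":
    # Toy records, invented purely to exercise the split.
    toy = [
        {"country": "Italy", "participant": 1},
        {"country": "Italy", "participant": 2},
        {"country": "India", "participant": 3},
        {"country": "Mexico", "participant": 4},
    ]
    for country, train, test in leave_one_country_out(toy):
        print(country, len(train), len(test))
```

Because no participant from the held-out country appears in the training split, the test score reflects transfer to an unseen country rather than performance on unseen people from a familiar one.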
Problem

Research questions and friction points this paper is trying to address.

Cross-country behavior modeling
Smartphone sensor data diversity
Domain adaptation and generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-country smartphone sensor dataset
26 sensor modalities data collection
350K+ self-reports analysis
Matteo Busso
University of Trento
Research Infrastructure, Privacy, Service Design, Situational Context, Everyday AI
A. Bontempelli
University of Trento, Italy
Leonardo Malcotti
Unknown affiliation
L. Meegahapola
ETH Zurich, Switzerland
Peter Kun
Postdoc, IT University of Copenhagen
generative artificial intelligence, creative data work, human-computer interaction
Shyam Diwakar
Amrita Mind Brain Center, Amrita Vishwa Vidyapeetham
Neurophysiology, Computational Neuroscience, Cerebellum, EEG, virtual laboratories
Chaitanya Nutakki
Amrita Vishwa Vidyapeetham, India
Marcelo Rodas Britez
University of Trento & FBK, Italy
Hao Xu
Jilin University, China
Donglei Song
Jilin University, China
Salvador Ruiz Correa
Instituto Potosino de Investigación Científica y Tecnológica, Mexico
A. Mendoza-Lara
Instituto Potosino de Investigación Científica y Tecnológica, Mexico
George Gaskell
London School of Economics and Political Science, UK
S. Stares
London School of Economics and Political Science, UK
Miriam Bidoglia
London School of Economics and Political Science, UK
A. Ganbold
National University of Mongolia, Mongolia
Altangerel Chagnaa
Associate Professor, Department of Information and Computer Sciences, National University of
Natural Language Processing, Machine learning
Luca Cernuzzi
Universidad Católica "Nuestra Señora de la Asunción"
Software Engineering, Social Informatics
Alethia Hume
Universidad Católica "Nuestra Señora de la Asunción", Paraguay
Ronald Chenu-Abente
University of Trento, Italy
Roy Alia Asiku
University of Trento, Italy
Ivan Kayongo
University of Trento, Italy
D. Gatica-Perez
Idiap Research Institute & EPFL, Switzerland
A. D. Gotzen
Aalborg University, Denmark
Ivano Bison
Full Professor, Trento University
Computational Social Science, Data science, Sequences analysis, Sociology, Research Methodology
Fausto Giunchiglia
Professor of Computer Science, Università di Trento
Computational theories of the mind