AI Summary
This study addresses the identification of individuals from wrist-worn accelerometer data collected in free-living settings, where no labels mark the walking periods. We use 80 Hz wrist acceleration recordings from 15,000 participants in the NHANES cohort (up to 7 days per participant, more than 10 TB in total). We first apply Adaptive Empirical Pattern Transformation (ADEPT), a highly specific method for detecting walking, to extract walking bouts from daily activity. We then transform the acceleration time series into images that represent the joint distribution of the series and its lags, and use these images to construct person-specific walking fingerprints. In extensive cross-validation on this large, heterogeneous sample representative of the U.S. population, the highest-performing model places the correct participant in the top 1% of predictions 96% of the time.
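The time-lag image idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes the image is a normalised 2-D histogram of pairs (x[t], x[t + lag]), and the function name `lag_image`, the bin count, the value range, and the synthetic signal are all hypothetical choices made only for illustration.

```python
import numpy as np


def lag_image(signal, lag, bins=32, value_range=(0.0, 3.0)):
    """Return a normalised 2-D histogram of (x[t], x[t + lag]) pairs.

    Assumption: the "image" is an empirical joint distribution on a fixed
    grid; the source does not specify binning or value range.
    """
    x = np.asarray(signal, dtype=float)
    if lag <= 0 or lag >= x.size:
        raise ValueError("lag must be between 1 and len(signal) - 1")
    current, lagged = x[:-lag], x[lag:]  # aligned pairs separated by `lag` samples
    hist, _, _ = np.histogram2d(
        current, lagged, bins=bins, range=[value_range, value_range]
    )
    total = hist.sum()
    return hist / total if total > 0 else hist


# Synthetic stand-in for 10 seconds of 80 Hz acceleration magnitude (in g).
fs = 80
t = np.arange(10 * fs) / fs
signal = 1.0 + 0.4 * np.sin(2 * np.pi * 1.8 * t)

image = lag_image(signal, lag=fs // 4)  # lag of 0.25 s, chosen arbitrarily
print(image.shape)
```

A periodic signal such as walking tends to trace a closed curve in the (x[t], x[t + lag]) plane, so the shape of the resulting image depends on cadence and amplitude. That is one plausible reason such images could differ between individuals.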
Abstract
We propose a method for identifying individuals based on their continuously monitored wrist-worn accelerometry during activities of daily living. The method consists of three steps: (1) using Adaptive Empirical Pattern Transformation (ADEPT), a highly specific method to identify walking; (2) transforming the accelerometry time series into an image that corresponds to the joint distribution of the time series and its lags; and (3) using the resulting images to construct a person-specific walking fingerprint. The method is applied to 15,000 individuals from the National Health and Nutrition Examination Survey (NHANES) with up to 7 days of wrist accelerometry data collected at 80 Hertz. The resulting dataset contains more than 10 terabytes, is roughly 2 to 3 orders of magnitude larger than previous datasets used for activity recognition, is collected in the free living environment, and does not contain labels for walking periods. Using extensive cross-validation studies, we show that our method is highly predictive and can be successfully extended to a large, heterogeneous sample representative of the U.S. population: in the highest-performing model, the correct participant is in the top 1% of predictions 96% of the time.
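The reported metric (the correct participant falling within the top 1% of predictions) can be computed as sketched below. This is a minimal illustration under stated assumptions, not the authors' evaluation code: it assumes each test sample comes with one score per candidate participant, and the function name `top_percent_hit_rate` and the toy scores are hypothetical.

```python
import numpy as np


def top_percent_hit_rate(scores, true_ids, percent=1.0):
    """Fraction of samples whose true class ranks within the top `percent` of classes.

    scores:   array of shape (n_samples, n_classes), higher means more likely.
    true_ids: integer class index of the correct participant for each sample.
    """
    scores = np.asarray(scores, dtype=float)
    true_ids = np.asarray(true_ids)
    n_samples, n_classes = scores.shape
    # Number of top-ranked candidates that count as a hit (at least one).
    k = max(1, int(np.ceil(n_classes * percent / 100.0)))
    true_scores = scores[np.arange(n_samples), true_ids]
    # Rank = 1 + number of candidates scoring strictly higher than the true one.
    ranks = 1 + (scores > true_scores[:, None]).sum(axis=1)
    return float((ranks <= k).mean())


# Toy example with 3 candidates; real use would have thousands of participants.
scores = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
])
true_ids = [0, 2, 1]
print(top_percent_hit_rate(scores, true_ids, percent=1.0))
```

With 15,000 candidates, the top 1% corresponds to the 150 highest-scoring participants, so this criterion is considerably more lenient than exact top-1 identification accuracy.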