🤖 AI Summary
To address the high cost of endpoint annotation in Wi-Fi trajectory localization, this paper proposes a semi-self representation learning method that automatically labels a large set of crowdsourced Wi-Fi trajectories using only a small set of endpoint-annotated trajectories, whose size scales linearly with the size of the venue. Methodologically, it introduces a novel "cut-and-flip" augmentation scheme based on the meet-in-the-middle paradigm, together with a two-stage learning procedure in which trajectory embedding is followed by endpoint embedding. The learned representations are then labeled using the small annotated set and connected to a neural localization network. Experiments demonstrate that the approach reduces manual annotation effort by over 90% while maintaining localization accuracy comparable to fully supervised methods. The framework thus supports automatic annotation and localization for large-scale Wi-Fi trajectory data.
📝 Abstract
WiFi fingerprint-based localization has been studied intensively. Point-based solutions rely on position annotations of WiFi fingerprints. Trajectory-based solutions, however, require end-position annotations of WiFi trajectories, where a WiFi trajectory is a multivariate time series of signal features. A trajectory dataset is much larger than a pointwise dataset as the number of potential trajectories in a field may grow exponentially with respect to the size of the field. This work presents a semi-self representation learning solution, where a large dataset $C$ of crowdsourced unlabeled WiFi trajectories can be automatically labeled by a much smaller dataset $\tilde C$ of labeled WiFi trajectories. The size of $\tilde C$ only needs to be proportional to the size of the physical field, while the unlabeled $C$ could be much larger. This is made possible through a novel "cut-and-flip" augmentation scheme based on the meet-in-the-middle paradigm. A two-stage learning consisting of trajectory embedding followed by endpoint embedding is proposed for the unlabeled $C$. Then the learned representations are labeled by $\tilde C$ and connected to a neural-based localization network. The result, while delivering promising accuracy, significantly relieves the burden of human annotations for trajectory-based localization.
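The abstract does not spell out how "cut-and-flip" works, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a trajectory is stored as a NumPy array of shape `(T, F)` (T time steps, F signal features), that "cut" means splitting it at an interior time index, and that "flip" means reversing the second piece in time so that both pieces end at the same physical point (they "meet in the middle"). The function name `cut_and_flip` and the parameter `cut_index` are hypothetical.

```python
import numpy as np


def cut_and_flip(trajectory, cut_index):
    """Split a (T, F) trajectory at cut_index and time-reverse the tail.

    Assumed interpretation: both returned pieces end at the sample taken
    at the cut point, so they could serve as a positive pair when learning
    endpoint representations without any position labels.
    """
    trajectory = np.asarray(trajectory)
    n_steps = trajectory.shape[0]
    # Require an interior cut so each piece has at least two samples.
    if not 0 < cut_index < n_steps - 1:
        raise ValueError("cut_index must be an interior time index")

    # Head: from the start of the walk up to and including the cut point.
    head = trajectory[: cut_index + 1]
    # Tail: from the cut point to the end, reversed in time so that it
    # also ends at the cut point.
    flipped_tail = trajectory[cut_index:][::-1]
    return head, flipped_tail


# Toy trajectory: 6 time steps, 2 signal features per step.
toy = np.arange(12, dtype=float).reshape(6, 2)
head, flipped_tail = cut_and_flip(toy, cut_index=2)
print(head.shape, flipped_tail.shape)  # (3, 2) (4, 2)
```

Under this reading, every unlabeled trajectory in $C$ yields many such pairs (one per interior cut index), which is what would let the endpoint embedding be trained without position annotations.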