🤖 AI Summary
Existing trajectory-based demographic inference methods rely heavily on large-scale labeled datasets, which limits their interpretability and their generalizability across datasets and user populations. To address these limitations, we propose a zero-shot hierarchical chain-of-thought reasoning framework that needs no labeled training data. Our method first converts raw trajectories into semantically rich natural language descriptions, namely activity logs and multi-scale visit summaries, and then uses structured prompting to guide a large language model through three reasoning stages: factual feature extraction, behavioral pattern analysis, and demographic inference. Because it requires no trajectory annotations and exposes its reasoning chain, the approach improves both interpretability and cross-dataset generalization. Evaluation on real-world trajectory data shows that it achieves competitive zero-shot performance on demographic attributes such as age, sex, and income, without task-specific fine-tuning or labeled supervision.
📝 Abstract
Inferring demographic attributes such as age, sex, or income level from human mobility patterns enables critical applications, including targeted public health interventions, equitable urban planning, and personalized transportation services. Existing mobility-based demographic inference studies rely heavily on large-scale trajectory data with demographic labels, which limits their interpretability and their generalizability across datasets and user groups. We propose HiCoTraj (Zero-Shot Demographic Reasoning via Hierarchical Chain-of-Thought Prompting from Trajectory), a framework that leverages the zero-shot learning and semantic understanding capabilities of LLMs to perform demographic inference without labeled training data. HiCoTraj first transforms trajectories into semantically rich natural language representations by creating detailed activity chronicles and multi-scale visiting summaries. It then uses a novel hierarchical chain-of-thought reasoning scheme to guide the LLM systematically through three cognitive stages: factual feature extraction, behavioral pattern analysis, and demographic inference with structured output. This approach addresses the scarcity of labeled demographic data while providing transparent reasoning chains. Experimental evaluation on real-world trajectory data demonstrates that HiCoTraj achieves competitive performance across multiple demographic attributes in zero-shot scenarios.
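The two steps described in the abstract, turning a trajectory into text and then prompting in three stages, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Visit` record, its fields, the helper names, the two summary scales (place category and weekday/weekend), and the prompt wording are all hypothetical choices made for illustration. The sketch only builds the prompt string and makes no LLM call.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Visit:
    """One stay in a trajectory. The fields are illustrative assumptions."""
    start: datetime
    end: datetime
    place: str      # hypothetical place name
    category: str   # hypothetical place category, e.g. "Office"


def activity_chronicle(visits):
    """Render visits as a chronological, human-readable activity log."""
    lines = []
    for v in sorted(visits, key=lambda v: v.start):
        minutes = int((v.end - v.start).total_seconds() // 60)
        lines.append(
            f"{v.start:%Y-%m-%d %H:%M}: {v.place} ({v.category}), {minutes} min"
        )
    return "\n".join(lines)


def visit_summary(visits):
    """Summarise visit counts at two assumed scales: category and day type."""
    by_category = Counter(v.category for v in visits)
    by_daytype = Counter(
        "weekend" if v.start.weekday() >= 5 else "weekday" for v in visits
    )
    cat = ", ".join(f"{c}: {n}" for c, n in by_category.most_common())
    day = ", ".join(f"{d}: {n}" for d, n in sorted(by_daytype.items()))
    return f"Visits by category: {cat}\nVisits by day type: {day}"


# Stage names follow the abstract; the instruction wording is an assumption.
STAGES = (
    "Stage 1 (factual feature extraction): list observable facts only.",
    "Stage 2 (behavioral pattern analysis): describe routines implied by the facts.",
    "Stage 3 (demographic inference): give one estimate per attribute as JSON.",
)


def build_prompt(visits, attributes=("age", "sex", "income level")):
    """Assemble the text representation and the staged instructions."""
    parts = [
        "Activity chronicle:",
        activity_chronicle(visits),
        "",
        "Visit summary:",
        visit_summary(visits),
        "",
        "Reason in three stages:",
        *STAGES,
        "",
        "Attributes to infer: " + ", ".join(attributes),
    ]
    return "\n".join(parts)
```

The returned string would be sent to whichever LLM is in use. Keeping the stages as explicit, ordered instructions is one simple way to make the model's intermediate reasoning inspectable, which is the interpretability benefit the abstract describes.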