Urban Safety Perception Through the Lens of Large Multimodal Models: A Persona-based Approach

📅 2025-03-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Traditional survey-based assessment of urban safety perception is costly and scales poorly. Method: This study proposes a zero-shot approach that uses a large multimodal model (LMM), LLaVA-1.6-7B, with *Persona prompting* to simulate the visual judgments of demographic groups that vary by age, gender, and nationality. Without any fine-tuning, the model classifies street-view images as safe or unsafe. Contribution/Results: The model achieves an average zero-shot F1-score of 59.21%. Its default (no-Persona) output aligns most closely with a middle-aged male Persona, which points to an inherent bias in the base model. Isolation, physical decay, and infrastructure deficiencies emerge as the key drivers of perceived unsafety, and the share of images classified as unsafe varies by nationality Persona, from 19.71% (Singapore) to 40.15% (Botswana). To our knowledge, this is the first work to explicitly integrate sociodemographic perspectives into LMM inference, offering a methodological basis for fairness-aware, AI-driven urban perception research.

📝 Abstract

Understanding how urban environments are perceived in terms of safety is crucial for urban planning and policymaking. Traditional methods such as surveys are limited by high cost, long turnaround time, and poor scalability. To overcome these challenges, this study introduces Large Multimodal Models (LMMs), specifically LLaVA 1.6 7B, as a novel approach to assessing safety perceptions of urban spaces from street-view images. The study also investigates how this task is affected by different socio-demographic perspectives, which the model simulates through Persona-based prompts. Without additional fine-tuning, the model achieved an average F1-score of 59.21% in classifying urban scenarios as safe or unsafe, and it identified three key drivers of perceived unsafety: isolation, physical decay, and urban infrastructural challenges. Moreover, incorporating Persona-based prompts revealed significant variations in safety perceptions across the socio-demographic groups of age, gender, and nationality. Elderly and female Personas consistently perceived higher levels of unsafety than younger or male Personas. Similarly, nationality-specific differences were evident in the proportion of unsafe classifications, which ranged from 19.71% for Singapore to 40.15% for Botswana. Notably, the model's default configuration aligned most closely with a middle-aged, male Persona. These findings highlight the potential of LMMs as a scalable and cost-effective alternative to traditional methods for assessing urban safety perceptions. While the sensitivity of these models to socio-demographic factors underscores the need for thoughtful deployment, their ability to provide nuanced perspectives makes them a promising tool for AI-driven urban planning.
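
The Persona-based prompting described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' prompt: the `Persona` class, the `build_prompt` and `parse_label` helpers, and the prompt wording are all hypothetical, and the step that sends the prompt and image to the model is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona; fields mirror the groups named in the abstract."""
    age: Optional[str] = None          # e.g. "elderly"
    gender: Optional[str] = None       # e.g. "female"
    nationality: Optional[str] = None  # e.g. "Singapore"


def build_prompt(persona: Optional[Persona] = None) -> str:
    """Compose an illustrative zero-shot prompt (the wording is an assumption)."""
    question = (
        "Look at this street-view image. Would you feel safe here? "
        "Answer with exactly one word: 'safe' or 'unsafe'."
    )
    if persona is None:
        # Default configuration: the model answers without any persona.
        return question
    traits = [t for t in (persona.age, persona.gender) if t]
    description = " ".join(traits + ["person"])
    if persona.nationality:
        description += f" from {persona.nationality}"
    article = "an" if description[0].lower() in "aeiou" else "a"
    return f"Imagine you are {article} {description}. {question}"


def parse_label(response: str) -> Optional[str]:
    """Map free-form model text to 'safe', 'unsafe', or None if unclear."""
    text = response.strip().lower()
    # Check the negative forms first, because "unsafe" contains "safe".
    if "unsafe" in text or "not safe" in text:
        return "unsafe"
    if "safe" in text:
        return "safe"
    return None


if __name__ == "__main__":
    print(build_prompt())
    print(build_prompt(Persona(age="elderly", gender="female", nationality="Singapore")))
```

Keeping the question text identical across Personas means that any difference in the predicted label can be attributed to the Persona prefix rather than to a change in the task wording.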

Problem

Research questions and friction points this paper is trying to address.

Assessing urban safety perceptions using Large Multimodal Models (LMMs).

Investigating socio-demographic impacts on safety perception via Persona-based prompts.

Identifying key drivers of perceived unsafety: isolation, physical decay, and infrastructural challenges.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Large Multimodal Models for urban safety analysis.

Employs Persona-based prompts for socio-demographic insights.

Achieves an average F1-score of 59.21% in binary safe/unsafe classification without fine-tuning.
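
For readers who want to reproduce this kind of evaluation, the sketch below shows how an average F1-score and a per-Persona unsafe classification rate could be computed from predicted labels. It assumes that "average" means a macro-average over the two classes, which this summary does not specify; the function names are hypothetical and the labels in the usage example are toy values, not the paper's data.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """F1 for one class, treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = ("safe", "unsafe")) -> float:
    """Unweighted mean of per-class F1 (an assumed reading of 'average F1')."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


def unsafe_rate(y_pred: Sequence[str]) -> float:
    """Share of images that were classified as unsafe."""
    return sum(p == "unsafe" for p in y_pred) / len(y_pred) if y_pred else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = ["safe", "safe", "unsafe", "unsafe"]
    preds = ["safe", "unsafe", "unsafe", "unsafe"]
    print(f"macro F1: {macro_f1(truth, preds):.4f}")
    print(f"unsafe rate: {unsafe_rate(preds):.2%}")
```

Computing `unsafe_rate` separately for the predictions of each Persona gives the kind of cross-group comparison the paper reports, such as the range across nationality Personas.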

🔎 Similar Papers

Urban Safety Perception Assessments via Integrating Multimodal Large Language Models with Street View Images