Can a Large Language Model Assess Urban Design Quality? Evaluating Walkability Metrics Across Expertise Levels

📅 2025-04-28
🤖 AI Summary
This study investigates whether multimodal large language models (MLLMs), specifically ChatGPT-4, can reliably assess urban walkability (e.g., pedestrian safety and attractiveness) from street view images, and proposes an ontology-driven method for injecting expert knowledge. Without that knowledge, MLLMs tend to score over-optimistically, misinterpret walkability metrics, and show poor cross-sample consistency because they rely on generic knowledge. To address this, the study designs a structured prompt engineering framework that explicitly embeds domain-specific urban design ontologies and standardized walkability evaluation criteria into the model's reasoning process. The experiments offer a first systematic validation that expert knowledge injection significantly improves MLLM assessment consistency (Krippendorff's α increases by 28%), reduces error rates (by 37%), and enhances professional semantic understanding. The study also identifies a synergistic effect between semantic clarity and domain expertise in prompts: combining the two amplifies evaluation reliability and validity.

📝 Abstract
Urban street environments are vital to supporting human activity in public spaces. The emergence of big data, such as street view images (SVIs) combined with multimodal large language models (MLLMs), is transforming how researchers and practitioners investigate, measure, and evaluate semantic and visual elements of urban environments. Considering the low threshold for creating automated evaluative workflows using MLLMs, it is crucial to explore both the risks and opportunities associated with these probabilistic models. In particular, the extent to which the integration of expert knowledge can influence the performance of MLLMs in evaluating the quality of urban design has not been fully explored. This study sets out an initial exploration of how integrating more formal and structured representations of expert urban design knowledge into the input prompts of an MLLM (ChatGPT-4) can enhance the model's capability and reliability in evaluating the walkability of built environments using SVIs. We collect walkability metrics from the existing literature and categorize them using relevant ontologies. We then select a subset of these metrics, focusing on the subthemes of pedestrian safety and attractiveness, and develop prompts for the MLLM accordingly. We analyze the MLLM's ability to evaluate SVI walkability subthemes through prompts with varying levels of clarity and specificity regarding evaluation criteria. Our experiments demonstrate that MLLMs are capable of providing assessments and interpretations based on general knowledge and can support the automation of multimodal image-text evaluations. However, they generally provide more optimistic scores and can make mistakes when interpreting the provided metrics, resulting in incorrect evaluations. By integrating expert knowledge, the MLLM's evaluative performance exhibits higher consistency and concentration.
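One way to make "higher consistency and concentration" concrete is to query the model several times per image and compare how widely the scores spread under each prompt type. The sketch below is a minimal illustration under that assumption: the function names, the 1-to-5 scale, and the sample scores are hypothetical, and the population standard deviation is used here as a simple stand-in for whatever dispersion measure the authors actually report.

```python
from statistics import mean, pstdev


def score_spread(scores_by_image):
    """Map each image id to (mean score, population std) over repeated runs."""
    return {img: (mean(s), pstdev(s)) for img, s in scores_by_image.items()}


def average_spread(scores_by_image):
    """Average the per-image standard deviations; lower means more concentrated."""
    return mean(sd for _, sd in score_spread(scores_by_image).values())


# Hypothetical repeated scores (1-5) for two images under two prompt types.
general_prompt_scores = {"svi_001": [3, 5, 4], "svi_002": [2, 4, 3]}
expert_prompt_scores = {"svi_001": [4, 4, 4], "svi_002": [3, 3, 4]}

if __name__ == "__main__":
    print("general:", average_spread(general_prompt_scores))
    print("expert: ", average_spread(expert_prompt_scores))
```

A lower average spread for the expert-informed prompts would correspond to the more concentrated evaluations the abstract describes; it says nothing by itself about whether those scores are correct.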
Problem

Research questions and friction points this paper is trying to address.

Evaluating walkability of urban streets using MLLMs and SVIs
Assessing impact of expert knowledge on MLLM urban design evaluations
Improving reliability of automated walkability metrics through structured prompts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multimodal large language models (MLLMs) for urban evaluation
Integrates expert knowledge into MLLM input prompts
Evaluates walkability via street view images (SVIs)
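The idea of prompts with varying levels of clarity and specificity can be sketched as a small prompt builder. This is an illustrative sketch only: the `Metric` fields, the three level names, the 1-to-5 scale, and the example wording are assumptions made for demonstration and are not taken from the paper's actual prompts or ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str        # metric label, e.g. drawn from the walkability literature
    subtheme: str    # e.g. "pedestrian safety" or "attractiveness"
    definition: str  # expert definition (hypothetical wording)
    rubric: str      # scoring guidance (hypothetical wording)


def build_prompt(metric: Metric, level: str) -> str:
    """Return a prompt at one of three illustrative specificity levels."""
    base = (
        f"Look at the street view image and rate its {metric.subtheme} "
        "on a scale from 1 to 5."
    )
    if level == "general":
        # Relies on the model's generic knowledge only.
        return base
    if level == "named":
        # Names the metric but does not define it.
        return f"{base} Focus on the metric '{metric.name}'."
    if level == "expert":
        # Adds an explicit definition and scoring rubric.
        return (
            f"{base} Focus on the metric '{metric.name}'.\n"
            f"Definition: {metric.definition}\n"
            f"Scoring rubric: {metric.rubric}\n"
            "Answer with the score followed by a one-sentence justification."
        )
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    sidewalk = Metric(
        name="sidewalk presence",
        subtheme="pedestrian safety",
        definition="A continuous walking surface separated from vehicle lanes.",
        rubric="1 = no sidewalk visible; 5 = wide, continuous, unobstructed.",
    )
    for level in ("general", "named", "expert"):
        print(f"--- {level} ---")
        print(build_prompt(sidewalk, level))
```

Holding the image and metric fixed while changing only `level` keeps the comparison focused on how much expert knowledge the prompt carries.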
Chenyi Cai
Singapore-ETH Centre, Future Cities Lab Global Programme, CREATE campus, 1 Create Way, #06-01 CREATE Tower, 138602, Singapore
Kosuke Kuriyama
Singapore-ETH Centre, Future Cities Lab Global Programme, CREATE campus, 1 Create Way, #06-01 CREATE Tower, 138602, Singapore; Takenaka Corporation, Project Development Division, 4-1-13 Hommachi, Chuo-Ku, Osaka, 541-0053, Japan
Youlong Gu
National University of Singapore
Urban Planning, Urban Analytics, Machine Learning
Filip Biljecki
Assistant Professor, Urban Analytics Lab, National University of Singapore
urban data science, urban informatics, urban analytics, geographic data science, GeoAI
Pieter Herthogs
Singapore-ETH Centre, Future Cities Lab Global Programme, CREATE campus, 1 Create Way, #06-01 CREATE Tower, 138602, Singapore