🤖 AI Summary
Traditional pedestrian counting methods fail to capture the quality of social interactions on urban streets. Method: We propose a multimodal large language model framework that combines street view image analysis with sociological theory, specifically Mehta's taxonomy of sociability, to automatically classify passive, fleeting, and enduring social activity. Linear regression models that control for confounders (e.g., weather, time of day, pedestrian counts) test whether the inferred sociability measures are associated with built-environment features, including sky, green, and water view indices, and with city-level place attachment scores from the World Values Survey. Contribution/Results: In a proof of concept on 2,998 street view images from 15 cities, the sky view index is associated with all three sociability types, the green view index predicts enduring sociability, and place attachment is positively associated with fleeting sociability. These preliminary results suggest that street view imagery could serve as a scalable, privacy-preserving tool for studying how the built environment relates to social behavior beyond mere headcounts, supporting cross-cultural theory testing and evidence-based urban design.
📝 Abstract
Designing socially active streets has long been a goal of urban planning, yet existing quantitative research largely measures pedestrian volume rather than the quality of social interactions. We hypothesize that street view imagery -- an inexpensive data source with global coverage -- contains latent social information that can be extracted and interpreted through established social science theory. As a proof of concept, we analyzed 2,998 street view images from 15 cities using a multimodal large language model guided by Mehta's taxonomy of passive, fleeting, and enduring sociability -- one illustrative example of a theory grounded in urban design that could be substituted or complemented by other sociological frameworks. We then used linear regression models, controlling for factors like weather, time of day, and pedestrian counts, to test whether the inferred sociability measures correlate with city-level place attachment scores from the World Values Survey and with environmental predictors (e.g., green, sky, and water view indices) derived from individual street view images. Results aligned with long-standing urban planning theory: the sky view index was associated with all three sociability types, the green view index predicted enduring sociability, and place attachment was positively associated with fleeting sociability. These results provide preliminary evidence that street view images can be used to infer relationships between specific types of social interactions and built environment variables. Further research could establish street view imagery as a scalable, privacy-preserving tool for studying urban sociability, enabling cross-cultural theory testing and evidence-based design of socially vibrant cities.
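The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or model specification: the data below are synthetic, the variable names (`green_view`, `sky_view`, `is_rainy`, `is_daytime`, `pedestrian_count`) and the helper `fit_ols` are hypothetical, and the binary encoding of weather and time of day is an assumption, since the abstract does not say how the controls were coded.

```python
import numpy as np


def fit_ols(y, predictors, controls):
    """Fit ordinary least squares with an intercept.

    `predictors` and `controls` are dicts mapping a (hypothetical) variable
    name to a 1-D array. Returns a dict of name -> estimated coefficient.
    """
    names = ["intercept"] + list(predictors) + list(controls)
    columns = [np.ones(len(y))]
    columns += [np.asarray(v, dtype=float) for v in predictors.values()]
    columns += [np.asarray(v, dtype=float) for v in controls.values()]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return dict(zip(names, beta))


# Synthetic stand-in for per-image measurements (NOT the paper's data).
rng = np.random.default_rng(0)
n = 500
green_view = rng.uniform(0, 1, n)        # share of vegetation pixels
sky_view = rng.uniform(0, 1, n)          # share of sky pixels
is_rainy = rng.integers(0, 2, n)         # assumed binary weather control
is_daytime = rng.integers(0, 2, n)       # assumed binary time-of-day control
pedestrian_count = rng.poisson(5, n)     # pedestrians visible in the image

# Simulated "enduring sociability" score with known effects, so the fit
# can be checked against the values used to generate it.
enduring = (
    0.8 * green_view
    + 0.3 * sky_view
    - 0.2 * is_rainy
    + 0.05 * pedestrian_count
    + rng.normal(0, 0.1, n)
)

coefs = fit_ols(
    enduring,
    predictors={"green_view": green_view, "sky_view": sky_view},
    controls={
        "is_rainy": is_rainy,
        "is_daytime": is_daytime,
        "pedestrian_count": pedestrian_count,
    },
)

for name, value in coefs.items():
    print(f"{name:>17}: {value: .3f}")
```

Including pedestrian count as a control means the environmental coefficients describe sociability at a fixed level of foot traffic, which is how an analysis of this kind separates the type of interaction from pedestrian volume. On this synthetic data the estimates should land close to the generating values; real analyses would also report standard errors and p-values, which this sketch omits.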