AI Summary
This work presents the first systematic evaluation of large language models (LLMs) on city-scale stochastic street address generation in European urban contexts. Focusing on Berlin, Paris, Rome, and Warsaw, the study quantitatively assesses generated addresses along three dimensions: geographic accuracy, syntactic validity, and statistical randomness, using real-world address datasets as ground truth. Methodologically, it integrates prompt engineering, pattern-based reasoning, and empirical benchmarking, deliberately avoiding hand-crafted rules. Results show that while LLMs produce syntactically correct addresses partially aligned with empirical distributions, they exhibit substantial geographic inaccuracies (including fabricated postal codes and mismatches between streets and their districts), as well as spurious randomness manifesting as repetitive structural patterns. The study delineates the implicit limits of LLMs' internal modeling of structured geospatial knowledge and establishes a reproducible, multi-dimensional evaluation framework for geospatial text generation. It further provides a critical baseline for advancing trustworthy LLM deployment in location intelligence applications.
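To make the evaluation axes concrete, here is a minimal sketch of how two of the three dimensions, syntactic validity and statistical randomness, could be scored for generated addresses. The address pattern, the Berlin-only format, and the entropy-based randomness proxy are illustrative assumptions, not the study's actual implementation:

```python
import math
import re
from collections import Counter

# Assumed surface pattern "Street Name 12, 10115 Berlin"; the real
# study covers four cities and richer formats.
ADDRESS_RE = re.compile(
    r"^(?P<street>[A-Za-zäöüÄÖÜß .'-]+) (?P<number>\d{1,4}), (?P<plz>\d{5}) Berlin$"
)

def syntactic_validity(addresses):
    """Fraction of addresses matching the expected surface pattern."""
    return sum(bool(ADDRESS_RE.match(a)) for a in addresses) / len(addresses)

def street_entropy(addresses):
    """Shannon entropy (bits) of the street-name distribution among
    syntactically valid addresses; low entropy flags the repetitive,
    non-random output the study reports."""
    streets = [m.group("street") for a in addresses if (m := ADDRESS_RE.match(a))]
    counts = Counter(streets)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

sample = [
    "Hauptstraße 12, 10115 Berlin",
    "Hauptstraße 7, 10115 Berlin",
    "Kastanienallee 3, 10435 Berlin",
    "not an address",
]
print(syntactic_validity(sample))            # 0.75
print(round(street_entropy(sample), 3))      # 0.918
```

Geographic accuracy, the third dimension, would additionally require a ground-truth gazetteer to check that each street, postal code, and district actually co-occur.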
Abstract
Large Language Models (LLMs) are capable of solving complex math problems and answering difficult questions on almost any topic, but can they generate random street addresses for European cities?