🤖 AI Summary
Medieval urban records carry sparse spatial information: they often give only relative directional relations (e.g., “X lies north of Y”) rather than precise coordinates. Method: This paper proposes a coordinate-free approach to spatial partitioning based on **confront networks**, spatial graphs whose edges encode these relative relations (“confronts”). It draws on both land registers and secondary historical sources, combining multi-strategy relation extraction, graph embedding, and Louvain community detection to infer urban regions. Contributions/Results: (1) Modeling of confronts as spatial graphs rather than standard tabular databases, with a comparison of several extraction methods; (2) Empirical evidence that ignoring part of the information in the primary sources, while adding information from secondary sources, significantly improves the confront network; (3) A selection criterion under which the best network is the optimal trade-off between coverage of the original data and graph-based approximation of spatial distance. On a dataset describing Avignon during its papal period, the best extracted graph is partitioned and its community structure is discussed in terms of the historical organization of urban space. The data and source code are publicly available online.
📝 Abstract
In historical studies, the older the sources, the more common it is to have access to data that are partial, unreliable, or imprecise. This can make it difficult, or even impossible, to perform certain tasks of interest, such as segmenting an urban space based on the location of its constituent elements. Indeed, traditional approaches to this task require knowing the position of all these elements before clustering them. Yet, alternative information is sometimes available, which can be leveraged to address this challenge. For instance, in the Middle Ages, land registries typically do not provide exact addresses, but rather locate spatial objects relative to each other, e.g., x lying to the north of y. Spatial graphs are particularly well suited to modeling such spatial relationships, called confronts, which is why we propose their use over standard tabular databases. However, historical data are rich and allow extracting confront networks in many ways, making the process non-trivial. In this article, we propose several extraction methods and compare them to identify the most appropriate one. We postulate that the best candidate must constitute an optimal trade-off between covering as much of the original data as possible, and providing the best graph-based approximation of spatial distance. Leveraging a dataset that describes Avignon during its papal period, we show empirically that the best results require ignoring some of the information present in the original historical sources, and that including additional information from secondary sources significantly improves the confront network. We illustrate the relevance of our method by partitioning the best graph that we extracted, and discussing its community structure in terms of urban space organization, from a historical perspective. Our data and source code are both publicly available online.
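The trade-off described in the abstract can be made concrete with a small sketch. It is illustrative only, not the authors' implementation: the record format, the entity names, the coordinates, the use of hop count as graph distance, and the use of Pearson correlation as the agreement measure are all assumptions made for this example.

```python
from collections import deque
from itertools import combinations
from math import dist, sqrt

# Hypothetical confront records: (source, direction, target).
# A target of None stands for a record whose target could not be identified.
RECORDS = [
    ("house_a", "north", "house_b"),
    ("house_b", "east", "house_c"),
    ("house_c", "south", "house_d"),
    ("house_d", "west", None),
    ("house_e", "north", "house_d"),
]

# Hypothetical coordinates, used only to evaluate the graph, never to build it.
KNOWN_POSITIONS = {
    "house_a": (0.0, 2.0),
    "house_b": (0.0, 1.0),
    "house_c": (-1.0, 1.0),
    "house_d": (-1.0, 2.0),
    "house_e": (-1.0, 3.0),
}


def build_confront_graph(records):
    """Return an undirected adjacency dict, skipping records with no target."""
    graph = {}
    for source, _direction, target in records:
        if target is None:
            continue
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph


def coverage(records, graph):
    """Fraction of the original records that ended up as an edge in the graph."""
    used = sum(1 for s, _d, t in records if t is not None and t in graph.get(s, ()))
    return used / len(records)


def hop_distances(graph, start):
    """Breadth-first search: number of edges from start to each reachable node."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def distance_agreement(graph, positions):
    """Correlate hop distance with Euclidean distance over connected, located pairs."""
    graph_d, spatial_d = [], []
    for u, v in combinations(sorted(positions), 2):
        if u not in graph:
            continue
        hops = hop_distances(graph, u)
        if v in hops:
            graph_d.append(hops[v])
            spatial_d.append(dist(positions[u], positions[v]))
    return pearson(graph_d, spatial_d)


graph = build_confront_graph(RECORDS)
cov = coverage(RECORDS, graph)
agreement = distance_agreement(graph, KNOWN_POSITIONS)
print(f"coverage={cov:.2f} agreement={agreement:.2f}")
```

Under these assumptions, comparing extraction methods amounts to computing both scores for each candidate graph and preferring the one that balances them best. How the two scores are actually defined and weighed is specified in the paper, not here.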