The Questio de aqua et terra: A Computational Authorship Verification Study

📅 2025-01-07
🤖 AI Summary
This study addresses the authorship controversy surrounding the medieval Latin treatise *Questio de aqua et terra*, traditionally attributed to Dante Alighieri. We assemble a corpus of 330 Latin texts from the 13th and 14th centuries and build a family of authorship verification (AV) systems that combine supervised machine learning with stylometry. We apply Distributional Random Oversampling (DRO), a technique tailored to text classification, to AV for the first time, and show that it is the key contributor to the accuracy of the best system despite the genre heterogeneity of the corpus. Under leave-one-out cross-validation, our best-performing system reaches an F1-score of 0.970. Applied to the *Questio*, it returns a highly confident prediction concerning the text's authenticity. The work contributes to the debate on the authorship of the *Questio* and highlights DRO's potential for AV applied to cultural heritage.

📝 Abstract
The Questio de aqua et terra is a cosmological treatise traditionally attributed to Dante Alighieri. However, the authenticity of this text is controversial, due to discrepancies with Dante's established works and to the absence of contemporary references. This study investigates the authenticity of the Questio via computational authorship verification (AV), a class of techniques which combine supervised machine learning and stylometry. We build a family of AV systems and assemble a corpus of 330 13th- and 14th-century Latin texts, which we use to comparatively evaluate the AV systems through leave-one-out cross-validation. Our best-performing system achieves high verification accuracy (F1=0.970) despite the heterogeneity of the corpus in terms of textual genre. The key contribution to the accuracy of this system is shown to come from Distributional Random Oversampling (DRO), a technique specially tailored to text classification which is here used for the first time in AV. The application of the AV system to the Questio returns a highly confident prediction concerning its authenticity. These findings contribute to the debate on the authorship of the Questio, and highlight DRO's potential in the application of AV to cultural heritage.
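The evaluation protocol named in the abstract (leave-one-out cross-validation scored by F1) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function-word list, the toy texts, and the nearest-centroid classifier are stand-ins, not the features, corpus, or learners used in the paper, and DRO itself is not implemented.

```python
from collections import Counter

# Hypothetical function-word list; a real study would use far richer features.
FUNCTION_WORDS = ["et", "in", "non", "est", "ad", "quod"]

def features(text):
    """Relative frequency of each function word in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(train_X, train_y, x):
    """Stand-in verifier: True if x is closer to the positive-class centroid."""
    pos = centroid([v for v, label in zip(train_X, train_y) if label])
    neg = centroid([v for v, label in zip(train_X, train_y) if not label])
    return sq_dist(x, pos) < sq_dist(x, neg)

def f1(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN), defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def leave_one_out_f1(X, y):
    """Hold out each text once, train on the rest, and pool the predictions."""
    tp = fp = fn = 0
    for i in range(len(X)):
        pred = predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i])
        if pred and y[i]:
            tp += 1
        elif pred and not y[i]:
            fp += 1
        elif not pred and y[i]:
            fn += 1
    return f1(tp, fp, fn)

# Toy corpus (invented strings): True marks texts by the candidate author.
texts = [
    ("et et et in non", True),
    ("et et in et est", True),
    ("et et et et ad", True),
    ("quod quod non quod in", False),
    ("quod non quod quod est", False),
    ("quod quod quod ad non", False),
]
X = [features(t) for t, _ in texts]
y = [label for _, label in texts]

print(leave_one_out_f1(X, y))
```

Each class needs at least two texts, so that removing the held-out text never leaves a class without training examples. With 330 texts, the same loop would train 330 models, one per held-out document.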
Problem

Research questions and friction points this paper is trying to address.

Medieval Text Attribution
Authorship Controversy
Questio de aqua et terra
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributional Random Oversampling
Authorship Verification
Medieval Latin Text Analysis
Martina Leocata
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, 56124 Pisa, Italy
Alejandro Moreo
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, 56124 Pisa, Italy
Fabrizio Sebastiani
Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche (ISTI-CNR)
Text Analytics, Text Classification, Sentiment Analysis, Opinion Mining, Machine Learning