🤖 AI Summary
To address the scarcity of physical commonsense reasoning evaluation resources for languages other than English, this paper introduces FormaMentis, the first bilingual (Italian–English) benchmark grounded in Italian linguistic and cultural specificity. Rather than relying on machine translation, FormaMentis is manually curated by native Italian experts who draw on local lived experience, and its English translation preserves culturally salient elements while following a PIQA-style format. Its key contribution is to combine physical commonsense reasoning evaluation with region-specific cultural knowledge and to keep that cultural meaning intact in cross-lingual transfer. The dataset was submitted to the MRL 2025 Shared Task as a culturally adapted evaluation resource for multilingual commonsense reasoning research, particularly for assessing AI systems’ physical understanding in languages other than English.
📝 Abstract
This paper presents our submission to the MRL 2025 Shared Task on Multilingual Physical Reasoning Datasets. The objective of the shared task is to create manually annotated evaluation data in the physical commonsense reasoning domain, for languages other than English, following a format similar to PIQA. Our contribution, FormaMentis, is a novel benchmark for physical commonsense reasoning that is grounded in Italian language and culture. The data samples in FormaMentis are created by expert annotators who are native Italian speakers and are familiar with local customs and norms. The samples are additionally translated into English, while preserving the cultural elements unique to the Italian context.