🤖 AI Summary
This study addresses the automatic parsing of English sentences into Uniform Meaning Representation (UMR) graphs. UMR is a graph-based semantic formalism whose annotation schema is flexible enough to cover a wide range of languages, including low-resource ones, and it shows promise for language documentation, low-resource language technologies, and interpretable semantic analysis. We propose two text-to-UMR parsing approaches: (1) fine-tuning existing Abstract Meaning Representation (AMR) parsers, and (2) leveraging a converter from Universal Dependencies. Prior work serves as the baseline. Our best-performing model, SETUP, achieves an AnCast score of 84 and a SMATCH++ score of 91, where both AnCast and SMATCH++ are evaluation metrics rather than training objectives. Because prior work on text-to-UMR parsing is limited, these results mark a substantial step towards accurate, large-scale automatic UMR parsing.
📝 Abstract
Uniform Meaning Representation (UMR) is a novel graph-based semantic representation that captures the core meaning of a text. Its annotation schema is flexible enough that the breadth of the world's languages, including low-resource languages, can be annotated. While UMR shows promise in enabling language documentation, improving low-resource language technologies, and adding interpretability, its downstream applications can only be fully explored once text-to-UMR parsers enable the automatic, large-scale production of accurate UMR graphs at test time. Prior work on text-to-UMR parsing is limited to date. In this paper, we introduce two methods for English text-to-UMR parsing: one fine-tunes existing parsers for Abstract Meaning Representation, and the other leverages a converter from Universal Dependencies. We use prior work as a baseline. Our best-performing model, which we call SETUP, achieves an AnCast score of 84 and a SMATCH++ score of 91, indicating substantial gains towards automatic UMR parsing.