🤖 AI Summary
Problem: Korean verb subcategorization frames (SFs) are scarce and structurally inconsistent, which hinders lexical resource development and NLP applications. Method: We introduce the first integrated Korean verb knowledge platform: (1) a structured verb lexicon derived from the Sejong Dictionary with fine-grained SF–sentence alignment; (2) lightweight dependency parsing and semantic role labeling (SRL) interfaces for automated SF parsing and validation; and (3) a full-stack React+Python web interface alongside an open-source Python toolkit. Contribution/Results: This work is the first systematic integration of lexical, syntactic, and semantic information for Korean verbs, and it improves the quality, consistency, and reusability of SF annotation. The platform lowers barriers to Korean verb knowledge acquisition and NLP model development, and has already enabled multiple downstream tasks, including predicate-argument structure modeling, grammatical error detection and correction, and constrained generation.
📝 Abstract
The Sejong dictionary dataset is a valuable resource with extensive coverage of morphology, syntax, and semantic representation, and it supports deeper exploration of linguistic information. Its labeled linguistic structures form the basis for uncovering relationships between words and phrases and their associations with target verbs. This paper introduces a user-friendly web interface for collecting and consolidating verb-related information, with a particular focus on subcategorization frames. It also describes our efforts to map this information by aligning subcategorization frames with corresponding illustrative example sentences. In addition, we provide a Python library that simplifies syntactic parsing and semantic role labeling. These tools are intended to assist anyone interested in using the Sejong dictionary dataset to develop applications for Korean language processing.
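To make the idea of aligning a subcategorization frame with an example sentence concrete, the following is a minimal sketch and not the paper's actual method or library API. It assumes a frame can be reduced to an ordered tuple of canonical case particles, that the sentence is split on whitespace, and that the final token is the predicate. The names `Frame`, `PARTICLE_VARIANTS`, and `align` are hypothetical, and the suffix check is a rough heuristic that a real system would replace with morphological analysis.

```python
from dataclasses import dataclass

# Allomorph groups for a few common case particles.
# Illustrative only; this table is an assumption and is far from exhaustive.
PARTICLE_VARIANTS = {
    "이": ("이", "가"),      # nominative
    "을": ("을", "를"),      # accusative
    "에게": ("에게",),       # dative
    "에": ("에",),           # locative
    "으로": ("으로", "로"),  # instrumental / directional
}


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified subcategorization frame for one verb."""
    verb: str
    slots: tuple  # canonical particles, e.g. ("이", "에게", "을")


def align(frame, sentence):
    """Greedily map each frame slot to the first unused token that ends
    with one of the slot's particle variants.

    Returns a dict {slot: token}, or None if any slot cannot be filled.
    """
    tokens = sentence.rstrip(".").split()
    arguments = tokens[:-1]  # assumption: the last token is the predicate
    used = set()
    mapping = {}
    for slot in frame.slots:
        for i, token in enumerate(arguments):
            if i not in used and token.endswith(PARTICLE_VARIANTS[slot]):
                mapping[slot] = token
                used.add(i)
                break
        else:
            return None  # the sentence does not realize this frame
    return mapping


# Frame for 주다 ("give"): someone-이 someone-에게 something-을 주다
give = Frame(verb="주다", slots=("이", "에게", "을"))
print(align(give, "선생님이 학생에게 책을 주었다."))
print(align(give, "선생님이 책을 읽었다."))
```

The first call fills all three slots, while the second returns `None` because the sentence has no dative argument. Surface suffix matching will misfire on nouns that happen to end in a particle-like syllable, which is one reason a parser-backed alignment would be preferable in practice.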