Unlocking Korean Verbs: A User-Friendly Exploration into the Verb Lexicon

📅 2024-10-01
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Resources describing Korean verb subcategorization frames (SFs) are scarce and structurally inconsistent, which hinders lexical resource development and NLP applications. Method: We introduce the first integrated Korean verb knowledge platform, with three components: (1) a structured verb lexicon derived from the Sejong dictionary with fine-grained alignment between SFs and example sentences; (2) lightweight dependency parsing and semantic role labeling (SRL) interfaces for automated SF parsing and validation; and (3) a full-stack React and Python web interface alongside an open-source Python toolkit. Contribution/Results: This work is the first systematic integration of Korean verbs' lexical, syntactic, and semantic information, and it improves the quality, consistency, and reusability of SF annotation. The platform lowers the barrier to Korean verb knowledge acquisition and NLP model development, and is intended to support downstream tasks such as predicate-argument structure modeling, grammatical error detection and correction, and constrained generation.

📝 Abstract
The Sejong dictionary dataset offers a valuable resource, providing extensive coverage of morphology, syntax, and semantic representation. This dataset can be utilized to explore linguistic information in greater depth. The labeled linguistic structures within this dataset form the basis for uncovering relationships between words and phrases and their associations with target verbs. This paper introduces a user-friendly web interface designed for the collection and consolidation of verb-related information, with a particular focus on subcategorization frames. Additionally, it outlines our efforts in mapping this information by aligning subcategorization frames with corresponding illustrative sentence examples. Furthermore, we provide a Python library that would simplify syntactic parsing and semantic role labeling. These tools are intended to assist individuals interested in harnessing the Sejong dictionary dataset to develop applications for Korean language processing.
Problem

Research questions and friction points this paper is trying to address.

Exploring the Korean verb lexicon using the Sejong dictionary dataset
Developing user-friendly tools for verb information collection
Simplifying syntactic parsing and semantic role labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

User-friendly web interface for verb data
Aligning subcategorization frames with illustrative example sentences
Python library for syntactic parsing and semantic role labeling
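
To make the idea of aligning a subcategorization frame with an example sentence concrete, here is a minimal sketch. It is not the paper's library or method: the frame notation (`N0-이 N1-을 V`), the names `Slot`, `parse_frame`, and `align`, and the small particle table are all assumptions made for illustration, and the matching is a naive suffix heuristic rather than real morphological analysis.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical, incomplete table: particle as written in a frame -> surface variants.
PARTICLE_VARIANTS = {
    "이": ("이", "가"),      # nominative
    "을": ("을", "를"),      # accusative
    "에": ("에",),           # locative/dative
    "에게": ("에게",),       # dative (animate)
    "으로": ("으로", "로"),  # instrumental/directional
}


@dataclass
class Slot:
    label: str     # argument label, e.g. "N0"
    particle: str  # case particle attached in the frame, e.g. "이"


def parse_frame(frame: str) -> List[Slot]:
    """Split a frame string such as 'N0-이 N1-을 V' into argument slots."""
    slots = []
    for item in frame.split():
        if "-" in item:  # the bare 'V' marks the verb and carries no particle
            label, particle = item.split("-", 1)
            slots.append(Slot(label, particle))
    return slots


def align(frame: str, sentence: str) -> Dict[str, Optional[str]]:
    """Map each slot to the first unused word that ends in a matching particle."""
    words = sentence.rstrip(".").split()
    used = set()
    result: Dict[str, Optional[str]] = {}
    for slot in parse_frame(frame):
        variants = PARTICLE_VARIANTS.get(slot.particle, (slot.particle,))
        result[slot.label] = None  # stays None when the sentence omits this argument
        for i, word in enumerate(words):
            if i not in used and word.endswith(variants):
                result[slot.label] = word
                used.add(i)
                break
    return result


# "The child eats rice."
print(align("N0-이 N1-을 V", "아이가 밥을 먹는다."))
# → {'N0': '아이가', 'N1': '밥을'}
```

A suffix match like this breaks on words that merely happen to end in a particle-like syllable, which is one reason a parser-backed approach, such as the dependency parsing and SRL interfaces the summary mentions, would be preferable in practice.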

Authors

Seohyun Song
SeoulTech, South Korea
Eunkyul Leah Jo
The University of British Columbia, Canada
Yige Chen
College of Computer Science and Artificial Intelligence, Wenzhou University
Jeen-Pyo Hong
42dot Inc., South Korea
Kyuwon Kim
Seoul National University, South Korea
Jin Wee
National Institute of Korean Language, South Korea
Miyoung Kang
National Institute of Korean Language, South Korea
Kyungtae Lim
SeoulTech, South Korea
Jungyeul Park
The University of British Columbia, Canada
Chulwoo Park
Anyang University, South Korea