Analyzing Finnish Inflectional Classes through Discriminative Lexicon and Deep Learning Models

📅 2025-09-05
🤖 AI Summary
This study asks whether Finnish noun inflection can be modeled without presupposed inflectional classes, and what that implies for the cognitive reality of such classes for native speakers. Method: Comprehension and production models are set up with the Discriminative Lexicon Model (DLM) and deep learning, using 55,271 inflected forms of 2,000 high-frequency Finnish nouns from 49 inflectional classes. Some models ignore frequency of use (endstate learning); others are trained with token frequencies taken into account (frequency-informed learning). Contribution/Results: Accuracy is very high on training data and lower, but still acceptable, on held-out test data. For most models, performance tracks the productivity of the inflectional classes: it is higher for classes with more types, more lower-frequency words, and more hapax legomena, and lower for novel forms of unproductive classes. For usage-based production models, however, token frequency is the dominant predictor, and correlations with productivity measures are tenuous or absent. Because the models are never given inflectional classes, the results call into question whether native speakers need to discover such classes in order to inflect nouns correctly.

📝 Abstract
Descriptions of complex nominal or verbal systems make use of inflectional classes. Inflectional classes bring together nouns which have similar stem changes and use similar exponents in their paradigms. Although inflectional classes can be very useful for language teaching as well as for setting up finite state morphological systems, it is unclear whether inflectional classes are cognitively real, in the sense that native speakers would need to discover these classes in order to learn how to properly inflect the nouns of their language. This study investigates whether the Discriminative Lexicon Model (DLM) can understand and produce Finnish inflected nouns without setting up inflectional classes, using a dataset with 55,271 inflected nouns of 2,000 high-frequency Finnish nouns from 49 inflectional classes. Several DLM comprehension and production models were set up. Some models were not informed about frequency of use, and provide insight into learnability with infinite exposure (endstate learning). Other models were set up from a usage-based perspective, and were trained with token frequencies being taken into consideration (frequency-informed learning). On training data, models performed with very high accuracies. For held-out test data, accuracies decreased, as expected, but remained acceptable. Across most models, performance increased for inflectional classes with more types, more lower-frequency words, and more hapax legomena, mirroring the productivity of the inflectional classes. The models struggle more with novel forms of unproductive and less productive classes, and perform far better for unseen forms belonging to productive classes. However, for usage-based production models, frequency was the dominant predictor of model performance, and correlations with measures of productivity were tenuous or absent.
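
The contrast between endstate and frequency-informed learning can be illustrated with a small sketch. The abstract does not describe the implementation, so everything below is an assumption: a comprehension mapping is treated as a linear map `F` from a form matrix `C` to a semantic matrix `S`, endstate learning as ordinary least squares, and frequency-informed learning as least squares weighted by token frequency. The data, dimensions, and function names are hypothetical and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 12 word forms, 5 form cues, 3 semantic dimensions.
n_words, n_cues, n_dims = 12, 5, 3
C = rng.integers(0, 2, size=(n_words, n_cues)).astype(float)  # form cues per word
S = rng.normal(size=(n_words, n_dims))                        # semantic vectors
freq = np.array([300, 120, 60, 30, 12, 8, 5, 3, 2, 1, 1, 1], dtype=float)


def endstate_mapping(C, S):
    """Least-squares F minimizing ||C F - S||; every word counts equally."""
    return np.linalg.pinv(C) @ S


def frequency_informed_mapping(C, S, freq):
    """Weighted least squares: a word's error counts in proportion to its frequency."""
    w = np.sqrt(freq / freq.sum())[:, None]  # square-root weights, one per row
    return np.linalg.pinv(w * C) @ (w * S)


def per_word_error(C, S, F):
    """Euclidean distance between predicted and target semantic vector, per word."""
    return np.linalg.norm(C @ F - S, axis=1)


F_end = endstate_mapping(C, S)
F_freq = frequency_informed_mapping(C, S, freq)

err_end = per_word_error(C, S, F_end)
err_freq = per_word_error(C, S, F_freq)
```

In this toy setup the frequency-weighted mapping fits high-frequency words at the expense of rare ones, which is one way to see why frequency could dominate performance in usage-based models. A production mapping would run in the opposite direction, from `S` to `C`, under the same assumptions.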
Problem

Research questions and friction points this paper is trying to address.

Investigating the cognitive reality of inflectional classes in language learning
Testing Discriminative Lexicon Model on Finnish noun inflection without classes
Assessing model performance across productive and unproductive inflection patterns
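
The productivity comparison in the last item can be made concrete with a per-class summary. This is a minimal sketch, not the authors' analysis: the class labels, forms, frequencies, and correctness flags are hypothetical placeholders, and productivity is approximated by the number of types and the number of hapax legomena (forms with token frequency 1) in each class.

```python
from collections import defaultdict

# Hypothetical held-out items: (form, inflectional class, token frequency, model correct?)
items = [
    ("form_01", "class_A", 1, True),
    ("form_02", "class_A", 1, True),
    ("form_03", "class_A", 7, True),
    ("form_04", "class_A", 1, False),
    ("form_05", "class_B", 250, True),
    ("form_06", "class_B", 90, False),
]


def summarize_by_class(items):
    """Return type count, hapax count, and accuracy for each inflectional class."""
    grouped = defaultdict(list)
    for form, cls, freq, correct in items:
        grouped[cls].append((form, freq, correct))

    summary = {}
    for cls, rows in grouped.items():
        summary[cls] = {
            "types": len({form for form, _, _ in rows}),
            "hapaxes": sum(1 for _, freq, _ in rows if freq == 1),
            "accuracy": sum(correct for _, _, correct in rows) / len(rows),
        }
    return summary


summary = summarize_by_class(items)
```

With real model output, the per-class accuracies could then be correlated with the type and hapax counts to check whether performance mirrors productivity.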
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discriminative Lexicon Model without inflectional classes
Deep learning models with token frequency training
Comprehension and production of Finnish inflections
Alexandre Nikolaev
University of Eastern Finland
Yu-Ying Chuang
National Taiwan Normal University
linguistics, statistics, phonetics, language acquisition
R. Harald Baayen
University of Tübingen