Determination of language families using deep learning

📅 2024-09-04
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of inferring language genealogical relationships solely from transcription-based textual fragments, without relying on translation or decipherment. It covers extant languages, deciphered ancient languages, and the undeciphered Cypro-Minoan script.

Method: We propose the first end-to-end framework leveraging convolutional generative adversarial networks (c-GANs) for language phylogeny modeling. It integrates transcriptional text representations with unsupervised and weakly supervised representation learning to bypass traditional comparative linguistics’ dependence on cognate identification and manually constructed sound-change rules.

Contribution/Results: Our approach successfully reconstructs established phylogenetic structures across multiple ancient languages and yields the first deep learning–based genealogical attribution for Cypro-Minoan, suggesting its plausible affiliation with the Aegean/Anatolian linguistic sphere. The method is both interpretable and scalable, offering a novel paradigm for phylogenetic placement and subsequent decipherment of undeciphered scripts.

📝 Abstract
We use a c-GAN (convolutional generative adversarial) neural network to analyze transliterated text fragments of extant, dead comprehensible, and one dead non-deciphered (Cypro-Minoan) language to establish linguistic affinities. The paper is agnostic with respect to translation and/or deciphering. However, there is hope that the proposed approach can be useful for decipherment with more sophisticated neural network techniques.
Problem

Research questions and friction points this paper is trying to address.

Identifying language families via deep learning analysis
Analyzing transliterated texts of extant and dead languages
Exploring linguistic affinities without translation or decipherment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses c-GAN for language family analysis
Analyzes transliterated text fragments
Agnostic to translation or deciphering
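
The summary does not say how transliterated fragments are presented to the network. As a minimal sketch, assuming a fixed-length character window and a Latin transliteration alphabet, a fragment could be turned into a one-hot grid that a convolutional model can consume like a small image. The alphabet, the window length, the function name `encode_fragment`, and the sample string are all hypothetical and are not taken from the paper.

```python
import string

# Assumed transliteration alphabet: 26 Latin letters plus space (27 symbols).
ALPHABET = string.ascii_lowercase + " "
# Assumed fixed window length, in characters.
WINDOW = 16


def encode_fragment(text, alphabet=ALPHABET, window=WINDOW):
    """Return a window x len(alphabet) one-hot grid for a transliterated fragment."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    # Lowercase, drop symbols outside the alphabet (e.g. syllable hyphens),
    # then truncate to the window length.
    chars = [ch for ch in text.lower() if ch in index][:window]
    grid = [[0] * len(alphabet) for _ in range(window)]
    for row, ch in enumerate(chars):
        grid[row][index[ch]] = 1
    # Rows past the end of the fragment stay all-zero and act as padding.
    return grid


# Made-up syllabic string used only to exercise the function.
grid = encode_fragment("ka-to-ri-ma")
print(len(grid), len(grid[0]))  # 16 27
print(sum(map(sum, grid)))      # 8
```

A grid like this keeps the input agnostic to meaning: only the sequence of transliterated symbols is encoded, which matches the stated goal of working without translation or decipherment.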
Peter B. Lerner
Unaffiliated