🤖 AI Summary
This paper addresses the long-standing fragmentation among mainstream formal grammatical frameworks, namely phrase-structure grammar (PSG), dependency grammar (DG), and categorial grammar (CG), in modeling discontinuous linguistic structures (e.g., split verbs, cross-constituent modification). We propose a unified representational system that integrates PSG's constituency, DG's head-dependent relations, and CG's functor-argument mechanism within a single categorical logic derivation framework. To jointly capture linear order and hierarchical dependency, we introduce a graph-structured representation. The system is intended to model discontinuity phenomena coherently across typologically diverse languages, including Turkish and Japanese, and to show that both continuous and discontinuous structures admit a consistent analysis under one formal theory. Our work aims to provide a formally rigorous theoretical foundation for computational grammar modeling and for integrating neural and symbolic approaches in natural language processing.
📝 Abstract
Syntactic discontinuity is a grammatical phenomenon in which a constituent is split into two or more parts by the insertion of an element that does not belong to the constituent. It is observed in many languages across the world, such as Turkish, Russian, Japanese, Warlpiri, Navajo, Hopi, Dyirbal, and Yidiny. Different frameworks and formalisms in current linguistic theory approach the problem of discontinuous structures in different ways, and each has been widely viewed as an independent, non-converging system of analysis. In this paper, we propose a unified system of representation for both continuity and discontinuity in the structures of natural languages, drawing on three formalisms: Phrase Structure Grammar (PSG) for its widely used notion of constituency, Dependency Grammar (DG) for its head-dependent relations, and Categorial Grammar (CG) for its focus on functor-argument relations. We attempt to show that discontinuous expressions as well as continuous structures can be analysed through a unified mathematical derivation that incorporates the representations of linguistic structure used in these three grammar formalisms.
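The definition of discontinuity given above can be made concrete with a small sketch. It is not the unified derivation proposed in the paper; it takes only the dependency view and checks whether the words dominated by a head form a contiguous span. The function names, the head-array encoding, and the toy analysis of the example sentence are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def yields(heads: List[int]) -> Dict[int, Set[int]]:
    """Map each word index to the set of positions it dominates (itself included).

    heads[i] is the index of word i's head, or -1 for the root.
    (Hypothetical encoding chosen for this sketch.)
    """
    result = {i: {i} for i in range(len(heads))}
    for i in range(len(heads)):
        # Walk up from word i, adding i to the yield of every ancestor.
        node = heads[i]
        while node != -1:
            result[node].add(i)
            node = heads[node]
    return result


def discontinuous_nodes(heads: List[int]) -> List[int]:
    """Return the words whose yield has a gap, i.e. is not one contiguous span."""
    out = []
    for node, span in yields(heads).items():
        # A contiguous span of n positions satisfies max - min + 1 == n.
        if max(span) - min(span) + 1 != len(span):
            out.append(node)
    return sorted(out)


# Assumed toy analysis of "A hearing is scheduled on the issue":
# "on the issue" depends on "hearing" but is separated from it by "is scheduled".
words = ["A", "hearing", "is", "scheduled", "on", "the", "issue"]
heads = [1, 2, -1, 2, 1, 6, 4]
print([words[i] for i in discontinuous_nodes(heads)])
```

Under this assumed analysis, the only head with a gapped yield is "hearing", because "is scheduled" intervenes between it and its modifier. A fully continuous tree such as `[1, 2, -1]` returns an empty list.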