A Poisson Factor Mixture Model for the Analysis of Linguistic Competence in Italian University Students' Writing

📅 2026-01-29
🤖 AI Summary
This study investigates whether Italian university students’ formal written language competence exhibits systematic structural patterns and group heterogeneity, rather than merely reflecting a generalized linguistic “decline.” Using a nationally representative corpus from the UniversITA writing assessment, the authors combine Poisson factor analysis with mixture modeling to analyse multidimensional linguistic features spanning orthography, lexicon, and syntax. The approach is designed for low-frequency count data and captures both dependencies among features and latent subpopulations. The analysis identifies two correlated dimensions, communicative competence and grammatical competence. Combined with educational and demographic variables, these dimensions yield distinct student profiles associated with field of study and educational background. The findings provide empirical grounding for evidence-based language instruction and policy design in higher education.

📝 Abstract
Public debate on the alleged decline of language skills among younger generations often focuses on university students, the most highly educated segment of the population. Rather than addressing the ill-posed question of linguistic decline, this paper examines how formal written Italian is currently used by university students and whether systematic patterns of competence and heterogeneity can be identified. The analysis is based on data from the UniversITA project, which collected formal texts written by a large and nationally representative sample of Italian university students. Texts were annotated for linguistically motivated features covering orthography, lexicon, syntax, morphosyntax, coherence, register, and sentence structure, yielding low-frequency multivariate count data. To analyse these data, we propose a novel model-based clustering approach based on a Poisson factor mixture model that accounts for dependence among linguistic features and unobserved population heterogeneity. The results identify two correlated dimensions of writing competence, interpretable as communicative competence and linguistic-grammatical competence. When educational and socio-demographic information is incorporated, distinct student profiles emerge that are associated with field of study and educational background. These findings provide quantitative evidence on contemporary writing and offer insights relevant for language education and higher education policy.
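To make the data structure concrete, the following is a minimal simulation sketch of one common way to write a Poisson factor mixture model. It is not the authors' specification, which the abstract does not give. The assumptions here are: latent factors drawn from a Gaussian mixture (the source of the clusters), a log link connecting factors to Poisson rates, and illustrative values throughout. The function name `simulate_poisson_factor_mixture`, the seven features, the two factors, and the three clusters are all hypothetical choices for illustration.

```python
import numpy as np


def simulate_poisson_factor_mixture(n=500, p=7, q=2, k=3, seed=0):
    """Simulate low-frequency multivariate counts (illustrative assumptions only).

    n: number of texts, p: number of annotated features,
    q: number of latent factors, k: number of latent clusters.
    """
    rng = np.random.default_rng(seed)

    # Mixture layer: each text belongs to one of k unobserved groups.
    weights = rng.dirichlet(np.ones(k))
    labels = rng.choice(k, size=n, p=weights)

    # Factor layer: group-specific means plus Gaussian noise give
    # q-dimensional latent competence scores for every text.
    component_means = rng.normal(0.0, 1.0, size=(k, q))
    factors = component_means[labels] + rng.normal(0.0, 0.5, size=(n, q))

    # Count layer: a log link maps factors to Poisson rates, so shared
    # factors induce dependence among the p feature counts.
    intercepts = rng.normal(-1.0, 0.3, size=p)  # negative values keep rates low
    loadings = rng.normal(0.0, 0.4, size=(p, q))
    log_rates = intercepts + factors @ loadings.T
    counts = rng.poisson(np.exp(log_rates))

    return counts, labels, factors


if __name__ == "__main__":
    counts, labels, factors = simulate_poisson_factor_mixture()
    print("count matrix shape:", counts.shape)
    print("share of zero counts:", np.mean(counts == 0))
```

In this sketch the counts are conditionally independent given the factors, so any correlation between features comes from the shared latent scores, while the cluster labels account for unobserved heterogeneity across students.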
Problem

Research questions and friction points this paper is trying to address.

linguistic competence
university students
writing analysis
population heterogeneity
formal written Italian
Innovation

Methods, ideas, or system contributions that make the work stand out.

Poisson factor mixture model
model-based clustering
linguistic competence
multivariate count data
writing assessment
Silvia Dallari
Department of Statistical Sciences, University of Bologna
Laura Anderlucci
Department of Statistical Sciences, University of Bologna
Nicola Grandi
Alma Mater Studiorum - Università di Bologna
Linguistics
Angela Montanari
Department of Statistical Sciences, University of Bologna