Evolving Voices Based on Temporal Poisson Factorisation

📅 2024-10-24

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: Modeling the fine-grained temporal evolution of topics in long-span political speech corpora (U.S. Senate proceedings across 18 sessions, 1981–2016) remains challenging because the text streams are sparse, high-dimensional, and non-stationary. Method: We propose Temporal Poisson Factorization (TPF), an extension of Poisson factorization in which the time-varying latent variables, covering topic prevalence and topic vocabulary, follow either a first-order autoregressive prior or a random walk prior. Estimation uses variational inference with tailored multivariate or univariate approximating families, combining coordinate ascent updates with automatic differentiation over batches of documents, which allows scalable decomposition of sparse term-frequency matrices. Results: Evaluated on over three decades of real legislative speech data, TPF robustly captures stable policy-topic trajectories and keyword migration patterns. It significantly outperforms static LDA and naive temporal baselines in both topic coherence and temporal predictive accuracy.

📝 Abstract

The world is evolving and so is the vocabulary used to discuss topics in speech. Analysing political speech data from more than 30 years requires the use of flexible topic models to uncover the latent topics and their change in prevalence over time as well as the change in the vocabulary of the topics. We propose the temporal Poisson factorisation (TPF) model as an extension to the Poisson factorisation model to model sparse count data matrices obtained based on the bag-of-words assumption from text documents with time stamps. We discuss and empirically compare different model specifications for the time-varying latent variables consisting either of a flexible auto-regressive structure of order one or a random walk. Estimation is based on variational inference where we consider a combination of coordinate ascent updates with automatic differentiation using batching of documents. Suitable variational families are proposed to ease inference. We compare results obtained using independent univariate variational distributions for the time-varying latent variables to those obtained with a multivariate variant. We discuss in detail the results of the TPF model when analysing speeches from 18 sessions in the U.S. Senate (1981-2016).
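To make the two prior specifications concrete, the following is a minimal simulation sketch, not the paper's actual parameterisation. It assumes Gamma-distributed document-topic intensities, Gaussian latent paths on the log scale for the topic vocabulary only, and illustrative hyperparameters; the function names `simulate_latent_path` and `simulate_counts` and the coefficient name `delta` are hypothetical. Setting `delta=1.0` gives a random walk, while `|delta| < 1` gives a mean-reverting autoregressive process of order one.

```python
import numpy as np


def simulate_latent_path(n_steps, n_series, delta=1.0, mean=0.0, scale=0.1, rng=None):
    """Simulate independent Gaussian paths of shape (n_steps, n_series).

    delta = 1 yields a random walk; |delta| < 1 yields an AR(1) process
    that reverts towards `mean`. All hyperparameters are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    path = np.empty((n_steps, n_series))
    path[0] = mean + scale * rng.standard_normal(n_series)
    for t in range(1, n_steps):
        noise = scale * rng.standard_normal(n_series)
        path[t] = mean + delta * (path[t - 1] - mean) + noise
    return path


def simulate_counts(n_docs=200, n_words=50, n_topics=3, n_steps=18, delta=1.0, seed=0):
    """Draw a sparse document-term count matrix with time-stamped documents."""
    rng = np.random.default_rng(seed)
    # Each document is assigned to one time period (e.g. a Senate session).
    time_of_doc = rng.integers(0, n_steps, size=n_docs)
    # Assumption: static Gamma document-topic intensities.
    theta = rng.gamma(shape=0.3, scale=1.0, size=(n_docs, n_topics))
    # Assumption: only the topic vocabulary evolves, on the log scale.
    log_beta = simulate_latent_path(
        n_steps, n_topics * n_words, delta=delta, mean=-1.0, scale=0.2, rng=rng
    )
    beta = np.exp(log_beta).reshape(n_steps, n_topics, n_words)
    # Poisson rate of word v in document d: sum_k theta[d, k] * beta[t_d, k, v].
    rates = np.einsum("dk,dkv->dv", theta, beta[time_of_doc])
    counts = rng.poisson(rates)
    return counts, time_of_doc, theta, beta


# Usage: compare an AR(1) specification with a random walk.
counts_ar, _, _, _ = simulate_counts(delta=0.9)
counts_rw, _, _, _ = simulate_counts(delta=1.0)
print(counts_ar.shape, float((counts_ar == 0).mean()))
print(counts_rw.shape, float((counts_rw == 0).mean()))
```

The printed zero fractions give a rough sense of how sparse the simulated bag-of-words matrix is under each specification. Fitting such a model would require the variational scheme described above, which this sketch does not attempt.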

Problem

Research questions and friction points this paper is trying to address.

Modeling evolving vocabulary in political speeches over time

Extending Poisson factorization for time-stamped text data

Analyzing topic prevalence and vocabulary changes in US Senate speeches

Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal Poisson factorisation model extension

Variational inference with coordinate ascent

Auto-regressive and random walk specifications
