Persistent Homology of Topic Networks for the Prediction of Reader Curiosity

📅 2025-06-06

🤖 AI Summary

This work addresses the challenge of quantifying information gaps within a text's semantic structure in order to predict reader curiosity. Method: Grounded in information gap theory, we propose a topological data analysis pipeline: (1) constructing a dynamic topic network with BERTopic-inspired topic modeling; (2) applying persistent homology (via Ripser/GUDHI) to extract topological features (connected components, cycles, and voids) as interpretable proxies for information gaps; and (3) fitting generalized additive models (GAMs) to predict curiosity. Contribution/Results: In a reading experiment with *The Hunger Games* (n = 49), the topological features raise explained deviance in curiosity ratings to 73%, compared with 30% for a baseline model. This suggests that the evolving topology of a semantic network offers both predictive power and interpretability for modeling reader engagement.

📝 Abstract

Reader curiosity, the drive to seek information, is crucial for textual engagement, yet remains relatively underexplored in NLP. Building on Loewenstein's Information Gap Theory, we introduce a framework that models reader curiosity by quantifying information gaps within a text's semantic structure. Our approach leverages BERTopic-inspired topic modeling and persistent homology to analyze the evolving topology (connected components, cycles, voids) of a dynamic semantic network derived from text segments, treating these features as proxies for information gaps. To empirically evaluate this pipeline, we collect reader curiosity ratings from participants (n = 49) as they read S. Collins's novel "The Hunger Games". We then use the topological features from our pipeline as independent variables to predict these ratings, and show experimentally that they significantly improve curiosity prediction compared to a baseline model (73% vs. 30% explained deviance), validating our approach. This pipeline offers a new computational method for analyzing text structure and its relation to reader engagement.
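To make the idea of tracking connected components across scales concrete, here is a minimal sketch of 0-dimensional persistent homology on a toy distance matrix. It is not the authors' implementation: the function name `h0_persistence` and the four-topic distance matrix are hypothetical, and the sketch covers connected components only. Cycles and voids need higher-dimensional homology, which in practice would come from a dedicated library.

```python
from itertools import combinations


def h0_persistence(dist):
    """Return (birth, death) pairs for connected components (H0).

    `dist` is a symmetric matrix of pairwise distances between nodes.
    Every node is born at scale 0; a component dies at the edge length
    that merges it into another component. The last surviving component
    never dies and gets death = float("inf").
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visit edges by increasing length, i.e. in filtration order.
    edges = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one of them dies at this scale.
            parent[root_i] = root_j
            pairs.append((0.0, length))
    pairs.append((0.0, float("inf")))
    return pairs


# Hypothetical distances between four topics: two tight pairs far apart.
toy_dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
diagram = h0_persistence(toy_dist)
```

In this toy example the component that survives until scale 0.8 corresponds to the two topic pairs staying disconnected over a wide range of thresholds, which is the kind of long-lived feature that could be read as a gap between groups of topics.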

Problem

Research questions and friction points this paper is trying to address.

Predict reader curiosity using semantic information gaps

Model text structure via topic networks and topology

Improve curiosity prediction with topological features

Innovation

Methods, ideas, or system contributions that make the work stand out.

BERTopic-inspired topic modeling for semantic analysis

Persistent homology for dynamic semantic network topology

Topological features predict reader curiosity effectively
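The 73% vs. 30% comparison in the abstract is stated as explained deviance, which is one minus the ratio of the fitted model's deviance to that of an intercept-only (null) model. The sketch below illustrates the arithmetic under the simplifying assumption of a Gaussian response, where deviance reduces to the residual sum of squares. The response family, the function names, and the toy numbers are assumptions for illustration and are not taken from the paper.

```python
def gaussian_deviance(y, y_hat):
    """Deviance for a Gaussian response: the residual sum of squares."""
    return sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))


def explained_deviance(y, y_hat):
    """Fraction of the null deviance removed by the fitted model."""
    mean_y = sum(y) / len(y)
    # The null model predicts the mean rating for every observation.
    null_dev = gaussian_deviance(y, [mean_y] * len(y))
    model_dev = gaussian_deviance(y, y_hat)
    return 1.0 - model_dev / null_dev


# Hypothetical ratings and model predictions, for illustration only.
ratings = [1.0, 2.0, 3.0, 4.0]
predictions = [1.5, 2.0, 3.0, 3.5]
score = explained_deviance(ratings, predictions)
```

Here the null deviance is 5.0 and the model deviance is 0.5, so the explained deviance is 0.9. A value of 0 means the model does no better than predicting the mean, and 1 means a perfect fit.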

🔎 Similar Papers

GINopic: Topic Modeling with Graph Isomorphism Network