BP-Seg: A graphical model approach to unsupervised and non-contiguous text segmentation using belief propagation

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses unsupervised, non-contiguous semantic text segmentation—identifying semantically coherent yet potentially non-adjacent sentence groups. We propose an end-to-end graph-based method: sentences are embedded using Sentence-BERT; a heterogeneous inter-sentence graph is dynamically constructed to jointly encode local coherence and long-range semantic similarity; and, for the first time in this task, belief propagation is employed to explicitly model cross-paragraph semantic dependencies. Unlike conventional sliding-window or sequence-labeling approaches, our method relaxes the adjacency constraint. Evaluated on long-document benchmarks, it significantly outperforms baselines—including LDA, TextTiling, and TopicSeg—with a 12.3% absolute improvement in segmentation accuracy. It particularly excels at recovering topic-consistent, non-contiguous segment structures, demonstrating robustness to discourse discontinuity and semantic sparsity.

📝 Abstract
Text segmentation based on the semantic meaning of sentences is a fundamental task with broad utility in many downstream applications. In this paper, we propose BP-Seg, an unsupervised learning approach based on graphical models, for efficient text segmentation. Our method not only accounts for local coherence, capturing the intuition that adjacent sentences are often more closely related, but also groups sentences that are distant in the text yet semantically similar. This is achieved by running belief propagation on carefully constructed graphical models. Experimental results on both an illustrative example and a dataset of long-form documents show that our method performs favorably compared to competing approaches.
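To make the idea of belief propagation over a sentence graph concrete, here is a minimal sketch of how such a segmenter could look. It is not the authors' construction: the fully connected graph, the Potts-style pairwise potential, the use of seed sentences as segment representatives, and the names `bp_segment`, `beta`, and `adjacency_bonus` are all assumptions made for illustration. Toy 2-D vectors stand in for real sentence embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def bp_segment(embeddings, seeds, n_iters=10, beta=2.0, adjacency_bonus=0.5):
    """Label each sentence with a segment index via loopy sum-product BP.

    Hypothetical setup (not taken from the paper): segment l is represented
    by the sentence at index seeds[l]; every pair of sentences shares an edge.
    """
    n, k = len(embeddings), len(seeds)
    # Unary potentials: affinity of each sentence to each seed sentence.
    unary = [[math.exp(beta * cosine(e, embeddings[s])) for s in seeds]
             for e in embeddings]
    # Edge weights: semantic similarity, plus a bonus for adjacent sentences
    # to encode local coherence.
    weight = {}
    for i in range(n):
        for j in range(n):
            if i != j:
                w = max(cosine(embeddings[i], embeddings[j]), 0.0)
                weight[(i, j)] = w + (adjacency_bonus if abs(i - j) == 1 else 0.0)
    # msgs[(i, j)][l]: message from sentence i to sentence j about label l.
    msgs = {edge: [1.0 / k] * k for edge in weight}
    for _ in range(n_iters):
        new_msgs = {}
        for (i, j), w in weight.items():
            # Combine i's unary potential with all incoming messages except j's.
            pre = [unary[i][l] * math.prod(msgs[(t, i)][l]
                                           for t in range(n) if t not in (i, j))
                   for l in range(k)]
            # Potts potential: exp(w) when labels agree, 1 when they differ.
            total, agree = sum(pre), math.exp(w)
            out = [agree * p + (total - p) for p in pre]
            z = sum(out)
            new_msgs[(i, j)] = [o / z for o in out]
        msgs = new_msgs
    # Belief at each node: unary potential times all incoming messages.
    labels = []
    for i in range(n):
        belief = [unary[i][l] * math.prod(msgs[(t, i)][l]
                                          for t in range(n) if t != i)
                  for l in range(k)]
        labels.append(max(range(k), key=belief.__getitem__))
    return labels

# Toy input: sentences 0, 1 and 4 point one way, sentences 2 and 3 another.
toy = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.05]]
labels = bp_segment(toy, seeds=[0, 2])
print(labels)
```

In this toy input the last sentence points in the same direction as the first two, so it should be grouped with them even though two unrelated sentences sit in between, which is the non-contiguous behavior the abstract describes. The sketch recomputes every message from scratch and costs roughly O(n³k) per iteration, so it is only suitable for short inputs.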
Problem

Research questions and friction points this paper is trying to address.

Unsupervised text segmentation using graphical models
Grouping distant but semantically similar sentences
Improving segmentation accuracy with belief propagation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised text segmentation using belief propagation
Graphical model for local and distant coherence
Non-contiguous semantic grouping of sentences
Fengyi Li
LinkedIn Corporation
Kayhan Behdin
LinkedIn
Operations Research, Optimization, Applied Statistics
Natesh Pillai
LinkedIn Corporation
Xiaofeng Wang
LinkedIn Corporation
Zhipeng Wang
LinkedIn Corporation
Ercan Yildiz
LinkedIn Corporation