🤖 AI Summary
This work addresses fine-grained, sentence-level style change detection: identifying stylistic shifts between adjacent sentences in a document, including the hard cases where sentences are short and carry little stylistic signal. The authors propose a lightweight Sequential Sentence Pair Classifier (SSPC): individual sentences are encoded with a pretrained language model, a bidirectional LSTM contextualizes the sentence representations within the document, and the contextualized vectors of each adjacent sentence pair are concatenated and fed into an MLP that predicts whether a style change occurs at that boundary. The document-level context is what helps the model handle "stylistically shallow" sentences. Evaluated on the official PAN-2025 test datasets, the model achieves macro-F1 scores of 0.923, 0.828, and 0.724 on the EASY, MEDIUM, and HARD data, respectively, outperforming both the official random baselines and zero-shot claude-3.7-sonnet. The results support context-aware sentence-pair modeling as an effective approach to style change detection.
📝 Abstract
Style change detection, the task of identifying the points in a document where writing style shifts, remains one of the most important and challenging problems in computational authorship analysis. At PAN 2025, the shared task challenges participants to detect style switches at the most fine-grained level: individual sentences. The task spans three datasets, each designed with controlled and increasing thematic variety within documents. We propose to address this problem by modeling the content of each problem instance (that is, a series of sentences) as a whole, using a Sequential Sentence Pair Classifier (SSPC). The architecture leverages a pre-trained language model (PLM) to obtain representations of individual sentences, which are then fed into a bidirectional LSTM (BiLSTM) to contextualize them within the document. The BiLSTM-produced vectors of adjacent sentences are concatenated and passed to a multi-layer perceptron, which makes one prediction per adjacency. Building on the work of previous PAN participants and on classical text segmentation, the approach is relatively conservative and lightweight. Nevertheless, it proves effective in leveraging contextual information and addressing what is arguably the most challenging aspect of this year's shared task: the notorious problem of "stylistically shallow", short sentences that are prevalent in the proposed benchmark data. Evaluated on the official PAN-2025 test datasets, the model achieves strong macro-F1 scores of 0.923, 0.828, and 0.724 on the EASY, MEDIUM, and HARD data, respectively, outperforming not only the official random baselines but also a much more challenging one: claude-3.7-sonnet's zero-shot performance.
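The data flow described in the abstract (sentence embeddings, then a BiLSTM, then concatenation of adjacent vectors, then an MLP) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name, all layer sizes, the ReLU activation, the single-logit output, and the 0.5 decision threshold are assumptions made for illustration, and random tensors stand in for the PLM sentence embeddings so that the example runs without downloading a model.

```python
import torch
import torch.nn as nn


class SequentialSentencePairClassifier(nn.Module):
    """Illustrative sketch: BiLSTM over sentence embeddings, MLP over adjacent pairs."""

    def __init__(self, embed_dim: int = 768, hidden_dim: int = 256, mlp_dim: int = 128):
        super().__init__()
        # Contextualizes each sentence vector within its document.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Each adjacency concatenates two BiLSTM outputs: 2 * (2 * hidden_dim) features.
        self.mlp = nn.Sequential(
            nn.Linear(4 * hidden_dim, mlp_dim),
            nn.ReLU(),
            nn.Linear(mlp_dim, 1),
        )

    def forward(self, sentence_embeddings: torch.Tensor) -> torch.Tensor:
        # sentence_embeddings: (batch, num_sentences, embed_dim),
        # assumed here to be pooled PLM outputs, one vector per sentence.
        contextual, _ = self.bilstm(sentence_embeddings)  # (batch, n, 2 * hidden_dim)
        left = contextual[:, :-1, :]   # sentences 0 .. n-2
        right = contextual[:, 1:, :]   # sentences 1 .. n-1
        pairs = torch.cat([left, right], dim=-1)  # (batch, n - 1, 4 * hidden_dim)
        return self.mlp(pairs).squeeze(-1)  # (batch, n - 1) logits, one per adjacency


torch.manual_seed(0)
model = SequentialSentencePairClassifier()
# Random vectors stand in for the PLM embeddings of one 6-sentence document.
fake_embeddings = torch.randn(1, 6, 768)
logits = model(fake_embeddings)
# One boolean per adjacent sentence pair; the 0.5 threshold is an assumption.
style_change = torch.sigmoid(logits) > 0.5
```

A document with n sentences yields n - 1 predictions, one per sentence boundary. Because the BiLSTM sees the whole sequence, each boundary decision can draw on surrounding sentences, which is the property the abstract credits for handling short sentences with little stylistic signal.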