🤖 AI Summary
This work addresses a common weakness of large language models: when generating long-form Chinese fiction on open-ended themes, they often struggle to balance structural coherence with narrative diversity. The authors propose a “climax-first, bidirectional expansion” strategy. Starting from a given theme, the model first extracts the core conflict and generates a well-defined climax, then uses bidirectional Monte Carlo Tree Search (MCTS), guided by Freytag’s pyramid, to expand the plot backward (rising action, exposition) and forward (falling action, resolution). The approach combines theme parsing, climax generation, bidirectional MCTS, fine-tuning of large language models, and a structured pipeline from outline to full narrative. Experiments on a newly curated Chinese theme corpus show that the method outperforms strong baselines in narrative coherence, plot structure, and thematic depth, with improvements confirmed by both automatic metrics and human evaluation, and that it enables longer, more coherent stories.
📝 Abstract
Generating long-form linear fiction from open-ended themes remains a major challenge for large language models, which frequently fail to guarantee global structure and narrative diversity when using premise-based or linear outlining approaches. We present BiT-MCTS, a theme-driven framework that operationalizes a "climax-first, bidirectional expansion" strategy motivated by Freytag's Pyramid. Given a theme, our method extracts a core dramatic conflict and generates an explicit climax, then employs a bidirectional Monte Carlo Tree Search (MCTS) to expand the plot backward (rising action, exposition) and forward (falling action, resolution) to produce a structured outline. A final generation stage realizes a complete narrative from the refined outline. We construct a Chinese theme corpus for evaluation and conduct extensive experiments across three contemporary LLM backbones. Results show that BiT-MCTS improves narrative coherence, plot structure, and thematic depth relative to strong baselines, while enabling substantially longer, more coherent stories according to automatic metrics and human judgments.
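The climax-first, bidirectional expansion described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the tree policy, the reward, or whether the two directions share a tree, so the sketch assumes two independent UCT searches (one per direction) and replaces every LLM call with a hypothetical stub. The names `propose`, `score`, `search`, and `build_outline`, the candidate count, and the toy reward are all illustrative assumptions.

```python
import math
import random
from dataclasses import dataclass, field

# Freytag stages on each side of the climax, ordered moving away from it.
BACKWARD_STAGES = ["rising action", "exposition"]
FORWARD_STAGES = ["falling action", "resolution"]

@dataclass
class Node:
    events: tuple                      # events chosen so far, moving away from the climax
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

def propose(stage, k=3):
    """Hypothetical stand-in for an LLM call proposing k candidate events."""
    return [f"{stage} option {i}" for i in range(k)]

def score(climax, events):
    """Hypothetical stand-in for a coherence scorer; toy rule prefers low option numbers."""
    return sum(1.0 / (1 + int(e.rsplit(" ", 1)[1])) for e in events) / len(events)

def uct(child, parent_visits, c=1.4):
    """Standard UCT score; unvisited children are tried first."""
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def search(climax, stages, iterations=200, seed=0):
    """Run a plain UCT search over one direction and return one event per stage."""
    rng = random.Random(seed)
    root = Node(events=())
    for _ in range(iterations):
        node = root
        # Selection and expansion: descend until an unvisited node or full depth.
        while len(node.events) < len(stages):
            if not node.children:
                stage = stages[len(node.events)]
                node.children = [Node(node.events + (e,), node) for e in propose(stage)]
            parent = node
            node = max(parent.children, key=lambda ch: uct(ch, parent.visits + 1))
            if node.visits == 0:
                break
        # Simulation: fill the remaining stages with random candidates.
        events = node.events
        while len(events) < len(stages):
            events += (rng.choice(propose(stages[len(events)])),)
        reward = score(climax, events)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited path.
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append(node.events[-1])
    return path

def build_outline(theme):
    """Climax first, then expand backward and forward, and stitch in story order."""
    climax = f"climax for theme: {theme}"      # placeholder for LLM climax generation
    before = search(climax, BACKWARD_STAGES)   # [rising action, exposition]
    after = search(climax, FORWARD_STAGES)     # [falling action, resolution]
    return list(reversed(before)) + [climax] + after

outline = build_outline("loyalty versus ambition")
```

Calling `build_outline` returns five outline entries in story order (exposition, rising action, climax, falling action, resolution). In a real system the stubs would be replaced by model calls, and the final generation stage would turn the outline into full prose.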