AI Summary
This work addresses the challenge of accurately summarizing deeply nested discussions, where interleaved replies, quotations, and overlapping topics hinder existing large language model-based summarization approaches. To overcome this, the authors propose a hierarchical, thread-aware summarization method that uses explicit discourse structure to guide content planning. The approach first extracts discourse aspects and atomic content units, then applies sentence ordering to construct thread-aware sequences that surface multiple viewpoints. It then employs a Tree-of-Thoughts search to generate and score multiple paragraph candidates, jointly optimizing coherence and coverage. The reported experiments show that the method outperforms existing baselines in producing logically structured summaries, with higher aspect retention and opinion coverage for nested discussions.
Abstract
Summarizing deeply nested discussion threads requires handling interleaved replies, quotes, and overlapping topics, which standard LLM summarizers struggle to capture reliably. We introduce ThreadSumm, a multi-stage LLM framework that treats thread summarization as a hierarchical reasoning problem over explicit aspect and content unit representations. Our method first performs content planning via LLM-based extraction of discourse aspects and Atomic Content Units, then applies sentence ordering to construct thread-aware sequences that surface multiple viewpoints rather than a single linear strand. On top of these interpretable units, ThreadSumm employs a Tree of Thoughts search that generates and scores multiple paragraph candidates, jointly optimizing coherence and coverage within a unified search space. With this multi-proposal and iterative refinement design, we show improved performance in generating logically structured summaries compared to existing baselines, while achieving higher aspect retention and opinion coverage in nested discussions.
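To make the search step concrete, the following is a minimal sketch of a Tree-of-Thoughts-style beam search over orderings of content units. It is not the ThreadSumm implementation: the `ContentUnit` class, the `coverage` and `coherence` proxies, the `alpha` weight, and the `beam_width` parameter are all hypothetical stand-ins, and the LLM calls that would generate and score paragraph candidates are replaced by deterministic toy functions so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ContentUnit:
    """Hypothetical stand-in for an Atomic Content Unit: one fact or opinion."""
    aspect: str
    text: str


Plan = Tuple[ContentUnit, ...]


def coverage(plan: Plan, units: Sequence[ContentUnit]) -> float:
    """Toy coverage proxy: fraction of distinct aspects the plan touches."""
    all_aspects = {u.aspect for u in units}
    if not all_aspects:
        return 0.0
    return len({u.aspect for u in plan}) / len(all_aspects)


def coherence(plan: Plan) -> float:
    """Toy coherence proxy: fraction of adjacent units sharing an aspect."""
    if len(plan) < 2:
        return 1.0
    same = sum(a.aspect == b.aspect for a, b in zip(plan, plan[1:]))
    return same / (len(plan) - 1)


def score(plan: Plan, units: Sequence[ContentUnit], alpha: float) -> float:
    """Joint objective: weighted sum of coherence and coverage (assumed form)."""
    return alpha * coherence(plan) + (1.0 - alpha) * coverage(plan, units)


def tot_search(units: Sequence[ContentUnit], max_len: int,
               beam_width: int = 3, alpha: float = 0.5) -> Plan:
    """Expand partial plans level by level, keeping the best `beam_width`."""
    frontier = [()]  # start from the empty plan
    for _ in range(min(max_len, len(units))):
        # Propose: extend each kept plan with every unit it does not yet use.
        candidates = [p + (u,) for p in frontier for u in units if u not in p]
        # Evaluate and prune: keep only the highest-scoring proposals.
        candidates.sort(key=lambda p: score(p, units, alpha), reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]


# Interleaved replies on two aspects, as they might appear in a thread.
thread_units = [
    ContentUnit("A", "first point on aspect A"),
    ContentUnit("B", "first point on aspect B"),
    ContentUnit("A", "reply on aspect A"),
    ContentUnit("B", "reply on aspect B"),
]
best = tot_search(thread_units, max_len=4)
print([u.aspect for u in best])
```

In this toy setting the search regroups the interleaved units by aspect, because the joint score rewards both touching every aspect and keeping adjacent units on the same topic. A real system would swap the two proxies for model-based judgments.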