🤖 AI Summary
Large language models (LLMs) often generate chain-of-thought (CoT) reasoning that is verbose, unstructured, and closed to user intervention, compromising auditability and controllability. Method: a hierarchical topic-modeling approach organizes CoT output into an interactive, tree-structured representation, paired with a visualization interface that supports real-time editing, backtracking, and feedback, enabling dynamic user intervention in the reasoning process. Integrated into Hippo, an AI-assisted decision-support system, the method was evaluated in a user study. Results: participants rapidly identified flawed reasoning steps, guided the model to revise its assumptions, and prompted supplementary perspectives, significantly improving reasoning transparency, response personalization, and understanding of model behavior. Core contribution: the first CoT interaction paradigm to support structured editing and closed-loop human-in-the-loop feedback.
📝 Abstract
The output quality of large language models (LLMs) can be improved via "reasoning": generating segments of chain-of-thought (CoT) content to further condition the model before it produces user-facing output. While these chains contain valuable information, they are verbose and lack explicit organization, making them tedious to review. Moreover, they offer no opportunities for user feedback, such as removing unwanted considerations, adding desired ones, or clarifying unclear assumptions. We introduce Interactive Reasoning, an interaction design that visualizes chain-of-thought outputs as a hierarchy of topics and enables user review and modification. We implement Interactive Reasoning in Hippo, a prototype for AI-assisted decision making in the face of uncertain trade-offs. In a user study with 16 participants, we find that Interactive Reasoning in Hippo allows users to quickly identify and interrupt erroneous generations, efficiently steer the model towards customized responses, and better understand both model reasoning and model outputs. Our work contributes to a new paradigm that incorporates user oversight into LLM reasoning processes.