Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information

📅 2025-11-27
🤖 AI Summary
Large language models (LLMs) use chain-of-thought (CoT) reasoning to improve complex problem-solving, but standard CoT consumes many tokens and adds latency, especially in zero-shot settings. Existing efficiency approaches mostly rely on model fine-tuning, which compromises zero-shot capability and deployment flexibility. To address this, we propose Focused Chain-of-Thought (F-CoT), a training-free, input-only method inspired by cognitive psychology that improves inference efficiency by decoupling information extraction from reasoning. F-CoT first builds a structured context from the input query, then guides the LLM to perform focused, zero-shot reasoning over that distilled representation. To our knowledge, this is the first work to systematically optimize LLM inference paths through input restructuring alone. On arithmetic word problems, F-CoT reduces generated tokens by 2–3× relative to standard zero-shot CoT while maintaining comparable accuracy, which improves both inference efficiency and practical deployability.

📝 Abstract
Recent large language models achieve strong reasoning performance by generating detailed chain-of-thought traces, but this often leads to excessive token use and high inference latency. Existing efficiency approaches typically focus on model-centric interventions, such as reinforcement learning or supervised fine-tuning, to reduce verbosity. In contrast, we propose a training-free, input-centric approach. Inspired by cognitive psychology, we introduce Focused Chain-of-Thought (F-CoT), which separates information extraction from the reasoning process. F-CoT first organizes the essential information from a query into a concise, structured context and then guides the model to reason exclusively over this context. By preventing attention to irrelevant details, F-CoT naturally produces shorter reasoning paths. On arithmetic word problems, F-CoT reduces generated tokens by 2-3x while maintaining accuracy comparable to standard zero-shot CoT. These results highlight structured input as a simple yet effective lever for more efficient LLM reasoning.
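
The two-stage flow described above (extract first, then reason only over the extracted context) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the template names, and the `llm` callable (any function that maps a prompt string to a completion string) are assumptions introduced here, since the abstract does not specify the structured format or the exact prompts.

```python
from typing import Callable

# Hypothetical prompt templates; the exact wording used in the paper is not
# given in the abstract, so these are illustrative assumptions only.
EXTRACT_TEMPLATE = (
    "Extract only the information needed to answer the question.\n"
    "List each given fact on its own line, then restate the question.\n"
    "Do not solve the problem.\n\n"
    "Problem: {query}"
)

REASON_TEMPLATE = (
    "Answer using only the context below. Think step by step.\n\n"
    "Context:\n{context}"
)


def build_extraction_prompt(query: str) -> str:
    """Stage 1 prompt: ask for a concise, structured context."""
    return EXTRACT_TEMPLATE.format(query=query)


def build_reasoning_prompt(context: str) -> str:
    """Stage 2 prompt: contains the structured context, not the raw query."""
    return REASON_TEMPLATE.format(context=context.strip())


def focused_cot(query: str, llm: Callable[[str], str]) -> str:
    """Run extraction, then reasoning, with two separate model calls.

    `llm` is a hypothetical stand-in for whatever model client is in use.
    """
    context = llm(build_extraction_prompt(query))
    return llm(build_reasoning_prompt(context))
```

Because the second prompt is built from the extracted context alone, details in the original wording that are irrelevant to the question never reach the reasoning step. That is the mechanism the abstract credits for the shorter reasoning paths.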
Problem

Research questions and friction points this paper is trying to address.

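Detailed chain-of-thought traces lead to excessive token use and high inference latency
Existing efficiency approaches rely on model-centric interventions such as reinforcement learning or supervised fine-tuning
Reasoning paths need to become shorter without sacrificing accuracy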
Innovation

Methods, ideas, or system contributions that make the work stand out.

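Separates information extraction from the reasoning process
Organizes the essential information in a query into a concise, structured context
Reduces generated tokens by 2-3x on arithmetic word problems while maintaining accuracy comparable to standard zero-shot CoT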