🤖 AI Summary
Large language models (LLMs) use chain-of-thought (CoT) reasoning to solve complex problems, but standard CoT incurs substantial token consumption and latency, especially in zero-shot settings. Existing optimization approaches rely predominantly on model fine-tuning, which compromises zero-shot capability and deployment flexibility. To address this, we propose Focused Chain-of-Thought (F-CoT), a training-free, input-only method that improves inference efficiency by decoupling information extraction from reasoning, an idea drawn from cognitive psychology. F-CoT first constructs a structured query context from the input, then guides the LLM to perform focused, zero-shot reasoning over this distilled representation. To our knowledge, this is the first work to systematically optimize LLM inference paths through input reconstruction alone. Evaluated on arithmetic word problems, F-CoT reduces token usage by 2–3× compared to standard zero-shot CoT while preserving comparable accuracy, making inference markedly cheaper and easier to deploy.
📝 Abstract
Recent large language models achieve strong reasoning performance by generating detailed chain-of-thought traces, but this often leads to excessive token use and high inference latency. Existing efficiency approaches typically focus on model-centric interventions, such as reinforcement learning or supervised fine-tuning, to reduce verbosity. In contrast, we propose a training-free, input-centric approach. Inspired by cognitive psychology, we introduce Focused Chain-of-Thought (F-CoT), which separates information extraction from the reasoning process. F-CoT first organizes the essential information from a query into a concise, structured context and then guides the model to reason exclusively over this context. By preventing attention to irrelevant details, F-CoT naturally produces shorter reasoning paths. On arithmetic word problems, F-CoT reduces generated tokens by 2-3x while maintaining accuracy comparable to standard zero-shot CoT. These results highlight structured input as a simple yet effective lever for more efficient LLM reasoning.
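For readers who want to try the idea, here is a minimal sketch of the two-stage prompting pattern the abstract describes: one call distills the problem into a concise, structured context, and a second call reasons only over that context. The client (OpenAI Python SDK), model name, and prompt wording are illustrative assumptions, not the authors' released prompts or evaluation setup.

```python
# Minimal sketch of the two-stage F-CoT prompting idea described above.
# Assumptions: OpenAI Python SDK as the LLM client, a placeholder model name,
# and illustrative prompt wording -- none of these come from the paper itself.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(prompt: str) -> str:
    """Send a single-turn prompt to the model and return its text reply."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; the paper does not specify one
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def focused_cot(problem: str) -> str:
    # Stage 1: extract only the information needed to answer, as a concise,
    # structured context (illustrative prompt wording).
    context = ask(
        "List only the facts and quantities needed to solve the problem, "
        "as short bullet points, then restate the question.\n\n"
        f"Problem: {problem}"
    )
    # Stage 2: reason zero-shot over the distilled context alone, so the model
    # does not attend to irrelevant details in the original wording.
    return ask(
        "Using only the context below, solve the problem step by step and "
        "state the final answer.\n\n"
        f"{context}"
    )


if __name__ == "__main__":
    print(focused_cot(
        "Ana buys 3 packs of 12 pencils and gives away 7. How many are left?"
    ))
```

Keeping the second call restricted to the distilled context is what shortens the reasoning trace: the model never re-reads the original, verbose problem statement, so fewer tokens are spent restating or attending to irrelevant details.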