Accelerating Structured Chain-of-Thought in Autonomous Vehicles

📅 2026-02-02
🤖 AI Summary
Chain-of-thought (CoT) reasoning in autonomous driving is generated token by token, and the resulting latency hinders real-time use. This work proposes FastDriveCoT, a parallel decoding method for template-structured CoT. It models the dependencies between reasoning sub-tasks as a directed graph, so that sub-tasks that do not depend on one another can be generated concurrently. Multiple independent reasoning steps are decoded within a single forward pass of a vision-language-action model, which reduces the number of sequential computations. Evaluated across multiple model architectures, the method achieves a 3–4× speedup in CoT generation and substantially reduces end-to-end latency while preserving the downstream task improvements that CoT reasoning provides.
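The dependency-graph idea can be illustrated with a minimal sketch. It is not the paper's implementation: the graph below is hypothetical (only "critical objects" and "traffic rules" are named in the source; the other sub-tasks and all edges are assumptions), and the grouping uses Python's standard `graphlib` module. Each "wave" collects sub-tasks whose prerequisites are already complete, so the members of one wave could in principle be decoded concurrently.

```python
from graphlib import TopologicalSorter


def parallel_waves(deps):
    """Group sub-tasks into waves.

    `deps` maps each sub-task to the set of sub-tasks it depends on.
    Every sub-task in a wave has all of its prerequisites in earlier waves.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


# Hypothetical template: names and edges are illustrative assumptions.
COT_DEPS = {
    "critical_objects": set(),
    "traffic_rules": set(),
    "object_behavior": {"critical_objects"},
    "driving_decision": {"object_behavior", "traffic_rules"},
}

waves = parallel_waves(COT_DEPS)
for index, wave in enumerate(waves):
    print(index, wave)
```

In this toy graph the first wave holds two independent sub-tasks, while the later waves hold one each; how much parallelism is available depends entirely on the shape of the real template.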

📝 Abstract
Chain-of-Thought (CoT) reasoning enhances the decision-making capabilities of vision-language-action models in autonomous driving, but its autoregressive nature introduces significant inference latency, making it impractical for real-time applications. To address this, we introduce FastDriveCoT, a novel parallel decoding method that accelerates template-structured CoT. Our approach decomposes the reasoning process into a dependency graph of distinct sub-tasks, such as identifying critical objects and summarizing traffic rules, some of which can be generated in parallel. By generating multiple independent reasoning steps concurrently within a single forward pass, we significantly reduce the number of sequential computations. Experiments demonstrate a 3–4× speedup in CoT generation and a substantial reduction in end-to-end latency across various model architectures, all while preserving the original downstream task improvements brought by incorporating CoT reasoning.
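
Why parallel decoding reduces the number of sequential computations can be sketched with a simple count. The model below is an idealization, not the paper's method or measurement: it assumes each sub-task starts as soon as its prerequisites finish and that every active sub-task advances by one token per forward pass. The sub-task names beyond the two mentioned in the abstract, the edges, and the token counts are all hypothetical.

```python
def autoregressive_passes(tokens):
    """Plain autoregressive decoding: one forward pass per generated token."""
    return sum(tokens.values())


def parallel_passes(deps, tokens):
    """Forward passes under the idealized parallel model.

    A sub-task finishes at (latest prerequisite finish) + (its own tokens),
    so the total is the length of the longest weighted path in the graph.
    """
    finish = {}

    def finish_time(task):
        if task not in finish:
            start = max((finish_time(dep) for dep in deps[task]), default=0)
            finish[task] = start + tokens[task]
        return finish[task]

    return max(finish_time(task) for task in deps)


# Hypothetical dependency graph and token counts (illustrative only).
DEPS = {
    "critical_objects": set(),
    "traffic_rules": set(),
    "object_behavior": {"critical_objects"},
    "driving_decision": {"object_behavior", "traffic_rules"},
}
TOKENS = {
    "critical_objects": 30,
    "traffic_rules": 20,
    "object_behavior": 25,
    "driving_decision": 10,
}

baseline = autoregressive_passes(TOKENS)   # 85
parallel = parallel_passes(DEPS, TOKENS)   # 65
print(baseline, parallel)
```

These toy numbers say nothing about the reported 3–4× figure; they only show that the pass count drops from the total token count to the longest dependency chain, so wider graphs with more independent sub-tasks leave more room for speedup.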

Problem

Research questions and friction points this paper is trying to address.

Chain-of-Thought
autonomous driving
inference latency
real-time applications
vision-language-action models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Chain-of-Thought
parallel decoding
autonomous driving
dependency graph
inference acceleration