🤖 AI Summary
Chain-of-thought (CoT) reasoning in autonomous driving suffers from significant latency due to sequential token generation, hindering real-time deployment. This work proposes FastDriveCoT, the first approach to parallel-decode templated CoT: subtask dependencies are modeled as a directed graph, decomposing structured reasoning into independent steps that can execute concurrently. Mutually independent reasoning steps are then generated synchronously within a single forward pass, combining dependency-graph modeling, parallel decoding, and a unified vision–language–action architecture. Evaluated across multiple model architectures, the method achieves a 3–4× speedup in CoT generation, substantially reducing end-to-end latency while maintaining or even improving downstream task performance.
📝 Abstract
Chain-of-Thought (CoT) reasoning enhances the decision-making capabilities of vision-language-action models in autonomous driving, but its autoregressive nature introduces significant inference latency, making it impractical for real-time applications. To address this, we introduce FastDriveCoT, a novel parallel decoding method that accelerates template-structured CoT. Our approach decomposes the reasoning process into a dependency graph of distinct sub-tasks, such as identifying critical objects and summarizing traffic rules, some of which can be generated in parallel. By generating multiple independent reasoning steps concurrently within a single forward pass, we significantly reduce the number of sequential computations. Experiments demonstrate a 3--4$\times$ speedup in CoT generation and a substantial reduction in end-to-end latency across various model architectures, all while preserving the downstream-task improvements that incorporating CoT reasoning brings.
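The scheduling idea behind the dependency graph can be illustrated with a small sketch. Sub-tasks that do not depend on each other are grouped into the same "level" (Kahn-style topological layering), so each level's reasoning steps could, in principle, be decoded in one forward pass. The sub-task names and dependency structure below are hypothetical, chosen to mirror the examples in the abstract; this is not the paper's actual implementation.

```python
from collections import defaultdict, deque

def parallel_decode_schedule(deps):
    """Group CoT sub-tasks into levels of mutually independent steps.

    `deps` maps each sub-task to the list of sub-tasks it depends on.
    All tasks in one returned level have no edges between them, so a
    parallel decoder could generate their tokens concurrently.
    """
    indegree = {t: len(d) for t, d in deps.items()}
    children = defaultdict(list)
    for task, parents in deps.items():
        for p in parents:
            children[p].append(task)
    # Start with sub-tasks that have no prerequisites.
    frontier = deque(t for t, n in indegree.items() if n == 0)
    levels = []
    while frontier:
        level = sorted(frontier)  # deterministic ordering for readability
        levels.append(level)
        frontier = deque()
        for t in level:
            for c in children[t]:
                indegree[c] -= 1
                if indegree[c] == 0:
                    frontier.append(c)
    return levels

# Hypothetical templated CoT for driving: scene description first, then
# critical-object identification and traffic-rule summarization (independent),
# then the final plan that consumes both.
deps = {
    "scene": [],
    "critical_objects": ["scene"],
    "traffic_rules": ["scene"],
    "plan": ["critical_objects", "traffic_rules"],
}
print(parallel_decode_schedule(deps))
# → [['scene'], ['critical_objects', 'traffic_rules'], ['plan']]
```

Here four sequential sub-tasks collapse into three decoding rounds; with wider templates (more independent sub-tasks per level) the reduction in sequential steps grows accordingly, which is the source of the reported 3--4$\times$ speedup.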