Orion-Lite: Distilling LLM Reasoning into Efficient Vision-Only Driving Models

📅 2026-04-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of effectively transferring the complex reasoning capabilities of large language models (LLMs) to autonomous driving systems under stringent latency and energy efficiency constraints, particularly for handling rare and highly interactive driving scenarios. The authors propose Orion-Lite, a lightweight, vision-only student model that distills reasoning knowledge from the large vision–language–action (VLA) teacher model ORION through a combination of latent feature distillation and supervision from real-world trajectories. Notably, this is the first demonstration of LLM-based knowledge distillation in a complex closed-loop driving setting. Experimental results show that Orion-Lite not only significantly outperforms its teacher but also achieves a new state-of-the-art Driving Score of 80.6 on the Bench2Drive benchmark, highlighting the substantial potential of pure vision architectures for high-performance reactive planning in autonomous driving.
📝 Abstract
Leveraging the general world knowledge of Large Language Models (LLMs) holds significant promise for improving the ability of autonomous driving systems to handle rare and complex scenarios. While integrating LLMs into Vision-Language-Action (VLA) models has yielded state-of-the-art performance, their massive parameter counts pose severe challenges for latency-sensitive and energy-efficient deployment. Distilling LLM knowledge into a compact driving model offers a compelling solution to retain these reasoning capabilities while maintaining a manageable computational footprint. Although previous works have demonstrated the efficacy of distillation, these efforts have primarily focused on relatively simple scenarios and open-loop evaluations. Therefore, in this work, we investigate LLM distillation in more complex, interactive scenarios under closed-loop evaluation. We demonstrate that through a combination of latent feature distillation and ground-truth trajectory supervision, an efficient vision-only student model **Orion-Lite** can even surpass the performance of its massive VLA teacher, ORION, setting a new state-of-the-art on the rigorous Bench2Drive benchmark with a Driving Score of 80.6. Ultimately, this reveals that vision-only architectures still possess significant, untapped potential for high-performance reactive planning.
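The abstract describes training the student on two signals at once: matching the teacher's latent features and fitting ground-truth trajectories. A minimal sketch of such a combined objective is below; the function names, tensor shapes, and the weighting factor `lam` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def distillation_loss(student_feat: np.ndarray, teacher_feat: np.ndarray) -> float:
    """MSE between the student's and the (frozen) teacher's latent features."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

def trajectory_loss(pred_traj: np.ndarray, gt_traj: np.ndarray) -> float:
    """Mean L2 error between predicted and ground-truth future waypoints.

    Trajectories are arrays of shape (num_waypoints, 2) in the ego frame.
    """
    return float(np.mean(np.linalg.norm(pred_traj - gt_traj, axis=-1)))

def total_loss(student_feat, teacher_feat, pred_traj, gt_traj, lam=1.0):
    # The student is optimized on both terms jointly; lam (assumed) balances
    # feature distillation against direct trajectory supervision.
    return trajectory_loss(pred_traj, gt_traj) + lam * distillation_loss(
        student_feat, teacher_feat
    )
```

A student that perfectly matches both the teacher's features and the ground-truth waypoints would drive this loss to zero; in practice the two terms trade off, which is one plausible reason a student supervised on real trajectories can surpass its teacher.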
Problem

Research questions and friction points this paper is trying to address.

LLM distillation
autonomous driving
vision-only models
closed-loop evaluation
complex driving scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM distillation
vision-only driving
closed-loop evaluation
latent feature distillation
reactive planning