🤖 AI Summary
Modern automotive deep learning frameworks exhibit temperature-sensitive failures—such as GPU thermal throttling, computation latency, high/mixed-precision errors, and time-series synchronization anomalies—under extreme ambient temperatures (−40°C to 50°C). Existing testing methodologies neglect thermal effects and thus fail to detect such defects. Method: We propose the first temperature-aware testing framework for deep learning systems, modeling GPU thermal dynamics via Newton's law of cooling, integrating real-time temperature-driven frequency scaling, and generating test cases through operator-level model mutation rules tailored to thermal sensitivity. Contribution/Results: Evaluated on mainstream automotive AI frameworks, our approach successfully uncovers novel thermal-induced defects—including latency spikes, accuracy degradation, and synchronization failures—demonstrating its effectiveness in exposing environment-dependent vulnerabilities. This work bridges a critical gap in AI framework quality assurance by explicitly incorporating environmental temperature as a first-class testing dimension.
📝 Abstract
Deep learning models play a vital role in autonomous driving systems, supporting critical functions such as environmental perception. To accelerate model inference, the deployment of these models relies on automotive deep learning frameworks, for example, PaddleInference in Apollo and TensorRT in AutoWare. However, unlike cloud deployments, vehicular environments experience extreme ambient temperatures ranging from -40°C to 50°C, which significantly affect GPU temperature. In addition, heat generated during computation raises the GPU temperature further. These temperature fluctuations trigger dynamic GPU frequency adjustments through mechanisms such as dynamic voltage and frequency scaling (DVFS). However, automotive deep learning frameworks are designed without considering temperature-induced frequency variations. When deployed on temperature-varying GPUs, these frameworks suffer critical quality issues: compute-intensive operators incur delays or errors, high/mixed-precision operators produce precision errors, and time-series operators encounter synchronization issues. Existing deep learning framework testing methods cannot detect these issues because they ignore temperature's effect on framework quality. To bridge this gap, we propose ThermalGuardian, the first testing method for automotive deep learning frameworks under temperature-varying environments. Specifically, ThermalGuardian generates test input models using mutation rules targeting temperature-sensitive operators, simulates GPU temperature fluctuations based on Newton's law of cooling, and controls GPU frequency based on real-time GPU temperature.
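The two physical ingredients of the method—Newton's law of cooling with a compute-heating term, and a temperature-driven clock throttle in the style of DVFS—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cooling constant `k`, the heating term, and the 60°C/90°C throttling thresholds are all assumed values chosen for the example.

```python
def simulate_gpu_temperature(t_ambient, t_init, heat_load, k=0.05,
                             steps=100, dt=1.0):
    """Euler integration of dT/dt = -k*(T - T_ambient) + heat_load.

    The first term is Newton's law of cooling toward the ambient
    temperature; heat_load models heating from sustained computation.
    Returns the temperature trace (°C) over time.
    """
    temps = [t_init]
    t = t_init
    for _ in range(steps):
        t += dt * (-k * (t - t_ambient) + heat_load)
        temps.append(t)
    return temps


def dvfs_frequency(temp_c, f_max=1800, f_min=600):
    """Toy DVFS-style throttling curve (MHz): full clock below 60°C,
    linear scaling down to f_min at 90°C and above. Real GPU
    governors use vendor-specific curves."""
    if temp_c <= 60:
        return f_max
    if temp_c >= 90:
        return f_min
    frac = (90 - temp_c) / 30.0  # fraction of headroom remaining
    return f_min + frac * (f_max - f_min)


# Hot-cabin scenario: 50°C ambient, sustained compute load.
# Steady state is t_ambient + heat_load/k = 50 + 2.0/0.05 = 90°C,
# so the simulated clock gradually throttles toward f_min.
trace = simulate_gpu_temperature(t_ambient=50, t_init=50, heat_load=2.0)
freqs = [dvfs_frequency(t) for t in trace]
```

Feeding such a temperature trace into a frequency controller is what lets a test harness reproduce thermal throttling deterministically, instead of waiting for a physical GPU to heat up.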