Beyond Interpretability: Exploring the Comprehensibility of Adaptive Video Streaming through Large Language Models

📅 2025-08-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep learning–based bitrate adaptation in adaptive video streaming operates as a “black box,” hindering developer understanding and optimization. Method: We propose the first algorithmic framework jointly optimizing performance and developer comprehensibility: (1) leveraging deep reinforcement learning to generate high-performing ensembles of decision trees; (2) introducing large language models (LLMs) to quantitatively assess the human comprehensibility of each tree—thereby modeling subjective interpretability; and (3) jointly optimizing both streaming performance and LLM-derived comprehensibility scores to select the optimal policy. Results: Experiments demonstrate that our method achieves state-of-the-art streaming quality (e.g., QoE, rebuffering ratio) while significantly enhancing developers’ depth of understanding and debuggability of decision logic—overcoming a key limitation of prior explainable AI research, which emphasizes model transparency without accounting for human cognitive alignment.
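The joint selection step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `TreePolicy` fields, the linear blend of QoE and comprehensibility, and the `weight` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class TreePolicy:
    """A candidate decision-tree bitrate policy (hypothetical structure)."""
    name: str
    qoe: float      # streaming performance, e.g. mean QoE over test traces
    clarity: float  # LLM-derived comprehensibility score, normalized to [0, 1]

def select_policy(candidates, weight=0.5):
    """Pick the tree maximizing a weighted blend of performance and
    comprehensibility. A linear blend is an illustrative assumption;
    the actual selection criterion is defined in the paper."""
    return max(candidates, key=lambda t: (1 - weight) * t.qoe + weight * t.clarity)

trees = [
    TreePolicy("deep-tree", qoe=0.95, clarity=0.30),     # strong but opaque
    TreePolicy("shallow-tree", qoe=0.92, clarity=0.85),  # nearly as strong, far clearer
]
best = select_policy(trees, weight=0.5)
```

With equal weighting, the slightly lower-performing but far more comprehensible tree wins, which is the trade-off the framework is designed to surface.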

📝 Abstract
Over the past decade, adaptive video streaming technology has witnessed significant advancements, particularly driven by the rapid evolution of deep learning techniques. However, the black-box nature of deep learning algorithms presents challenges for developers in understanding decision-making processes and optimizing for specific application scenarios. Although existing research has enhanced algorithm interpretability through decision tree conversion, interpretability does not directly equate to developers' subjective comprehensibility. To address this challenge, we introduce ComTree, the first bitrate adaptation algorithm generation framework that considers comprehensibility. The framework initially generates the complete set of decision trees that meet performance requirements, then leverages large language models to evaluate these trees for developer comprehensibility, ultimately selecting solutions that best facilitate human understanding and enhancement. Experimental results demonstrate that ComTree significantly improves comprehensibility while maintaining competitive performance, showing potential for further advancement. The source code is available at https://github.com/thu-media/ComTree.
Problem

Research questions and friction points this paper is trying to address.

Improving comprehensibility of adaptive video streaming algorithms
Addressing black-box nature of deep learning decision processes
Generating human-understandable bitrate adaptation solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates decision trees meeting performance requirements
Uses large language models to evaluate comprehensibility
Selects solutions optimizing human understanding and enhancement
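The LLM evaluation step above can be sketched as a prompt-and-parse loop. This is a hypothetical illustration: the prompt wording, the 1–10 scale, and the helper names are assumptions, and the actual model call is omitted.

```python
import re

def build_rating_prompt(tree_rules: str) -> str:
    """Build a prompt asking an LLM to rate how easily a developer could
    understand a decision tree. The wording is illustrative, not the
    paper's actual prompt."""
    return (
        "You are reviewing a bitrate-adaptation decision tree.\n"
        f"Rules:\n{tree_rules}\n"
        "On a scale of 1-10, how easily could a developer understand and "
        "modify this logic? Reply with a single integer."
    )

def parse_rating(reply: str) -> int:
    """Extract the first integer in 1-10 from the model's free-text reply."""
    m = re.search(r"\b(10|[1-9])\b", reply)
    if not m:
        raise ValueError("no rating found in reply")
    return int(m.group(1))
```

In practice each candidate tree would be serialized into `tree_rules`, the prompt sent to the model, and the parsed score combined with the tree's streaming performance during selection.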
🔎 Similar Papers
2024-06-09 · Annual Meeting of the Association for Computational Linguistics · Citations: 13
2024-08-08 · International Journal of Computer Vision · Citations: 13
Lianchen Jia
Department of Computer Science and Technology, Tsinghua University
Chaoyang Li
Department of Computer Science and Technology, Tsinghua University
Ziqi Yuan
Tsinghua University
Multimodal Machine Learning · Social AI
Jiahui Chen
Department of Computer Science and Technology, Tsinghua University
Tianchi Huang
Sony
Adaptive Video Streaming · Reinforcement Learning · Communication with ML
Jiangchuan Liu
Professor, Simon Fraser University; Fellow of IEEE, Royal Society of Canada, Canadian Academy of Engineering
Computer Science
Lifeng Sun
Department of Computer Science and Technology, Tsinghua University, Beijing National Research Center for Information Science and Technology