Beyond Interpretability: Exploring the Comprehensibility of Adaptive Video Streaming through Large Language Models

📅 2025-08-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep learning–based bitrate adaptation in adaptive video streaming operates as a “black box,” hindering developer understanding and optimization. Method: We propose the first algorithmic framework jointly optimizing performance and developer comprehensibility: (1) leveraging deep reinforcement learning to generate high-performing ensembles of decision trees; (2) introducing large language models (LLMs) to quantitatively assess the human comprehensibility of each tree—thereby modeling subjective interpretability; and (3) jointly optimizing both streaming performance and LLM-derived comprehensibility scores to select the optimal policy. Results: Experiments demonstrate that our method achieves state-of-the-art streaming quality (e.g., QoE, rebuffering ratio) while significantly enhancing developers’ depth of understanding and debuggability of decision logic—overcoming a key limitation of prior explainable AI research, which emphasizes model transparency without accounting for human cognitive alignment.
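The joint selection step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `TreePolicy` fields, the linear blend of QoE and comprehensibility, and the `weight` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class TreePolicy:
    """A candidate decision-tree bitrate policy (hypothetical structure)."""
    name: str
    qoe: float      # streaming performance, e.g. mean QoE over test traces
    clarity: float  # LLM-derived comprehensibility score, normalized to [0, 1]

def select_policy(candidates, weight=0.5):
    """Pick the tree maximizing a weighted blend of performance and
    comprehensibility. A linear blend is an illustrative assumption;
    the actual selection criterion is defined in the paper."""
    return max(candidates, key=lambda t: (1 - weight) * t.qoe + weight * t.clarity)

trees = [
    TreePolicy("deep-tree", qoe=0.95, clarity=0.30),     # strong but opaque
    TreePolicy("shallow-tree", qoe=0.92, clarity=0.85),  # nearly as strong, far clearer
]
best = select_policy(trees, weight=0.5)
```

With equal weighting, the slightly lower-performing but far more comprehensible tree wins, which is the trade-off the framework is designed to surface.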

📝 Abstract
Over the past decade, adaptive video streaming technology has witnessed significant advancements, particularly driven by the rapid evolution of deep learning techniques. However, the black-box nature of deep learning algorithms presents challenges for developers in understanding decision-making processes and optimizing for specific application scenarios. Although existing research has enhanced algorithm interpretability through decision tree conversion, interpretability does not directly equate to developers' subjective comprehensibility. To address this challenge, we introduce ComTree, the first bitrate adaptation algorithm generation framework that considers comprehensibility. The framework initially generates the complete set of decision trees that meet performance requirements, then leverages large language models to evaluate these trees for developer comprehensibility, ultimately selecting solutions that best facilitate human understanding and enhancement. Experimental results demonstrate that ComTree significantly improves comprehensibility while maintaining competitive performance, showing potential for further advancement. The source code is available at https://github.com/thu-media/ComTree.
Problem

Research questions and friction points this paper is trying to address.

Improving comprehensibility of adaptive video streaming algorithms
Addressing black-box nature of deep learning decision processes
Generating human-understandable bitrate adaptation solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates decision trees meeting performance requirements
Uses large language models to evaluate comprehensibility
Selects solutions optimizing human understanding and enhancement
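The LLM evaluation step above can be sketched as a prompt-and-parse loop. This is a hypothetical illustration: the prompt wording, the 1–10 scale, and the helper names are assumptions, and the actual model call is omitted.

```python
import re

def build_rating_prompt(tree_rules: str) -> str:
    """Build a prompt asking an LLM to rate how easily a developer could
    understand a decision tree. The wording is illustrative, not the
    paper's actual prompt."""
    return (
        "You are reviewing a bitrate-adaptation decision tree.\n"
        f"Rules:\n{tree_rules}\n"
        "On a scale of 1-10, how easily could a developer understand and "
        "modify this logic? Reply with a single integer."
    )

def parse_rating(reply: str) -> int:
    """Extract the first integer in 1-10 from the model's free-text reply."""
    m = re.search(r"\b(10|[1-9])\b", reply)
    if not m:
        raise ValueError("no rating found in reply")
    return int(m.group(1))
```

In practice each candidate tree would be serialized into `tree_rules`, the prompt sent to the model, and the parsed score combined with the tree's streaming performance during selection.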
🔎 Similar Papers
2024-06-09 · Annual Meeting of the Association for Computational Linguistics · Citations: 13
2024-08-08 · International Journal of Computer Vision · Citations: 13
Lianchen Jia
Department of Computer Science and Technology, Tsinghua University
Chaoyang Li
Department of Computer Science and Technology, Tsinghua University
Ziqi Yuan
Tsinghua University
Multimodal Machine Learning · Social AI
Jiahui Chen
Department of Computer Science and Technology, Tsinghua University
Tianchi Huang
Sony
Adaptive Video Streaming · Reinforcement Learning · Communication with ML
Jiangchuan Liu
Professor, Simon Fraser University; Fellow of IEEE, Royal Society of Canada, Canadian Academy of Engineering
Computer Science
Lifeng Sun
Department of Computer Science and Technology, Tsinghua University, Beijing National Research Center for Information Science and Technology