Doctor: Optimizing Container Rebuild Efficiency by Instruction Re-Orchestration

📅 2025-04-02
🤖 AI Summary
To address the high rebuild overhead caused by frequent Dockerfile modifications, this paper proposes the first rebuild optimization method tailored to long-term evolution scenarios. The approach mines historical modification patterns to build an instruction priority model, then applies syntactic dependency analysis and weighted topological sorting to reorder instructions, improving cache hit rates while preserving behavioral equivalence. It unifies future-change prediction, dependency-constraint modeling, and equivalence verification in a single optimization framework, and adds a lightweight validation mechanism. In an evaluation on 2,000 GitHub repositories, the method reduces average rebuild time by 26.5%. Rebuilds are faster for 92.75% of Dockerfiles, and 12.82% see a reduction of more than 50%. Functional similarity is preserved in 86.2% of cases.

📝 Abstract
Containerization has revolutionized software deployment, with Docker leading the way due to its ease of use and consistent runtime environment. As Docker usage grows, optimizing Dockerfile performance, particularly by reducing rebuild time, has become essential for maintaining efficient CI/CD pipelines. However, existing optimization approaches primarily address single builds without considering the recurring rebuild costs associated with modifications and evolution, limiting long-term efficiency gains. To bridge this gap, we present Doctor, a method for improving Dockerfile build efficiency through instruction re-ordering that addresses four key challenges: identifying instruction dependencies, predicting future modifications, ensuring behavioral equivalence, and managing the computational complexity of the optimization. We develop a comprehensive dependency taxonomy based on Dockerfile syntax, together with a historical modification analysis that prioritizes frequently modified instructions. Using a weighted topological sorting algorithm, Doctor optimizes instruction order to minimize future rebuild time while maintaining functionality. Experiments on 2,000 GitHub repositories show that Doctor improves 92.75% of Dockerfiles, reducing rebuild time by an average of 26.5%, with 12.82% of files achieving a reduction of more than 50%. Notably, 86.2% of cases preserve functional similarity. These findings highlight best practices for Dockerfile management, enabling developers to enhance Docker efficiency through informed optimization strategies.
Problem

Research questions and friction points this paper is trying to address.

Optimizing Dockerfile rebuild efficiency via instruction re-ordering
Reducing recurring rebuild costs in evolving containerized environments
Balancing optimization complexity with functional equivalence preservation
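
To make the recurring rebuild cost concrete, here is a minimal sketch of a cost model. It is not taken from the paper. It relies only on the documented behavior of Docker's layer cache: once an instruction changes, that instruction and every instruction after it are rebuilt. The instruction names, build times, and change counts below are hypothetical values chosen for illustration.

```python
def rebuild_cost(order, build_time, changed):
    """Seconds spent rebuilding when `changed` is modified.

    Cached layers are reused only up to the first changed instruction,
    so that instruction and everything after it must be rebuilt.
    """
    first = order.index(changed)
    return sum(build_time[name] for name in order[first:])


def total_rebuild_cost(order, build_time, change_count):
    """Total rebuild seconds over a history of single-instruction changes."""
    return sum(
        count * rebuild_cost(order, build_time, name)
        for name, count in change_count.items()
    )


# Hypothetical figures: source files change often and copy quickly,
# while system packages change rarely and install slowly.
build_time = {"copy_src": 1, "install_deps": 60}   # seconds per instruction
change_count = {"copy_src": 9, "install_deps": 1}  # modifications in history

volatile_first = ["copy_src", "install_deps"]
stable_first = ["install_deps", "copy_src"]

cost_volatile_first = total_rebuild_cost(volatile_first, build_time, change_count)
cost_stable_first = total_rebuild_cost(stable_first, build_time, change_count)
print(cost_volatile_first, cost_stable_first)  # 609 70
```

Under these assumed numbers, placing the rarely changed instruction first cuts the total rebuild cost from 609 to 70 seconds. Such a swap is only valid when the two instructions do not depend on each other.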
Innovation

Methods, ideas, or system contributions that make the work stand out.

Instruction re-ordering for Dockerfile optimization
Dependency taxonomy and modification analysis
Weighted topological sorting algorithm
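
The weighted topological sorting idea can be sketched as follows. This is a generic Kahn-style sort with a priority queue, not the authors' implementation. It assumes that the weight is a per-instruction modification frequency and that rarely modified instructions should come first. The Dockerfile, the dependency map, and the frequencies are hypothetical.

```python
import heapq


def weighted_topological_order(instructions, depends_on, change_freq):
    """Reorder instructions so rarely changed ones come first.

    depends_on maps an instruction index to the indices that must precede it.
    Among instructions whose dependencies are satisfied, the one with the
    lowest change frequency is emitted next (ties keep the original order).
    """
    n = len(instructions)
    indegree = [0] * n
    dependents = [[] for _ in range(n)]
    for later, earlier_list in depends_on.items():
        for earlier in earlier_list:
            dependents[earlier].append(later)
            indegree[later] += 1

    # Min-heap keyed by (change frequency, original position).
    ready = [(change_freq[i], i) for i in range(n) if indegree[i] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, i = heapq.heappop(ready)
        order.append(i)
        for j in dependents[i]:
            indegree[j] -= 1
            if indegree[j] == 0:
                heapq.heappush(ready, (change_freq[j], j))

    if len(order) != n:
        raise ValueError("dependency cycle detected")
    return [instructions[i] for i in order]


# Hypothetical Dockerfile and dependency map, for illustration only.
instructions = [
    "FROM python:3.12-slim",
    "COPY . /app",
    "RUN apt-get update && apt-get install -y build-essential",
    "RUN pip install -r /app/requirements.txt",
    'CMD ["python", "/app/main.py"]',
]
depends_on = {1: [0], 2: [0], 3: [1, 2], 4: [3]}
change_freq = [0, 9, 1, 2, 1]  # assumed modification counts from history

reordered = weighted_topological_order(instructions, depends_on, change_freq)
print("\n".join(reordered))
```

In this example the rarely changed `apt-get` step moves ahead of the frequently changed `COPY`, so edits to the source tree no longer invalidate the package-install layer. The `pip install` step stays after both because the assumed dependency map requires it.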
Authors
Zhiling Zhu
Zhejiang University of Technology, China
Tieming Chen
Zhejiang University of Technology, China
Chengwei Liu
Research Assistant Professor, Nanyang Technological University
Open Source Security, Software Supply Chain Security, Program Analysis, Software Maintenance
Han Liu
The Hong Kong University of Science and Technology, China
Qijie Song
Zhejiang University of Technology, China
Zhengzi Xu
Senior Research Fellow, Imperial College London
Software Engineering, Cyber Security, LLM, AI Trading
Yang Liu
Nanyang Technological University, Singapore