🤖 AI Summary
This paper addresses the engineering challenges of migrating large-scale cloud workloads from x86 to Arm instruction set architectures (ISAs), proposing a paradigm centered on source-code recompilation rather than binary translation. Drawing on nearly 40,000 real-world code commits at Google, we derive a systematic task taxonomy for large-scale ISA migration. Our approach combines static analysis, automated code refactoring, machine learning-assisted modifications, and CI pipeline monitoring, and it builds on a robust open-source ecosystem to recompile the relevant software stack from source. The methodology has been deployed internally at Google, where it automates many of the steps in x86-to-Arm migration across production systems and surfaces the tasks that remain challenging. Key contributions include: (1) formalizing a recompilation-first framework for ISA migration; (2) introducing an empirically grounded task classification system; and (3) demonstrating that AI can play a major role in automating migration tasks. Together these offer a blueprint that industry can reuse and open new research directions in ISA migration for academia.
📝 Abstract
Migrating codebases from one instruction set architecture (ISA) to another is a major engineering challenge. A recent example is the adoption of Arm (in addition to x86) across the major cloud hyperscalers. Yet this problem has received limited attention from the academic community. Most work has focused on static and dynamic binary translation, and the conventional wisdom has been that this is the primary challenge. In this paper, we show that this is no longer the case. Modern ISA migrations can often build on a robust open-source ecosystem, making it possible to recompile all relevant software from scratch. This introduces a new and multifaceted set of challenges that differ from those of binary translation. By analyzing a large-scale migration from x86 to Arm at Google, spanning almost 40,000 code commits, we derive a taxonomy of tasks involved in ISA migration. We show how Google automated many of the steps involved, and demonstrate how AI can play a major role in automatically addressing these tasks. We identify tasks that remain challenging and highlight open research problems that warrant further attention.