🤖 AI Summary
This paper challenges the widely held assumption that hosting data centers locally guarantees digital sovereignty, examining how the nationality of data center operators affects the facilities' exposure to foreign legal jurisdiction. Method: We construct a dataset of 775 non-U.S. data center projects, documenting over 20 variables per project and over 1,000 quotes, and analyze it using investment-value-weighted statistics, public information mining, and qualitative textual analysis. Contribution/Results: Weighted by investment value, U.S. companies operate 48% of the non-U.S. data center projects in the dataset, indicating that compute infrastructure outside the United States depends heavily on American entities. The paper identifies the data center operator as a lever in international AI governance and suggests that operational control, not merely physical location, shapes digital sovereignty. To foster transparency and reproducibility, we publicly release the dataset to support future empirical research on transnational data infrastructure governance.
📝 Abstract
Previous literature has proposed that the companies operating data centers enforce government regulations on AI companies. Using a new dataset of 775 non-U.S. data center projects, this paper estimates how often data centers could be subject to foreign legal authorities due to the nationality of the data center operators. We find that U.S. companies operate 48% of all non-U.S. data center projects in our dataset when weighted by investment value, a proxy for compute capacity. This is an approximation based on public data and should be interpreted as an initial estimate. For the United States, our findings suggest that data center operators offer a lever for internationally governing AI that complements traditional export controls, since operators can be used to regulate computing resources already deployed in non-U.S. data centers. For other countries, our results show that building data centers locally does not guarantee digital sovereignty if those facilities are run by foreign entities.
To support future research, we release our dataset, which documents over 20 variables relating to each data center, including the year it was announced, the investment value, and its operator's national affiliation. The dataset also includes over 1,000 quotes describing these data centers' strategic motivations, operational challenges, and engagement with U.S. and Chinese entities.
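As an illustration of what "weighted by investment value" means, the following is a minimal sketch rather than the paper's actual pipeline. The field names (`investment_usd`, `operator_country`), the toy records, and the choice to skip projects with no reported investment value are all assumptions made for this example; none of them are taken from the released dataset.

```python
from typing import Iterable, Mapping, Optional


def weighted_operator_share(
    projects: Iterable[Mapping[str, Optional[object]]],
    country: str = "US",
) -> float:
    """Share of total investment value in projects operated by `country`.

    Projects without an investment value are skipped (an assumption of
    this sketch; the paper may handle missing values differently).
    """
    total = 0.0
    matched = 0.0
    for project in projects:
        value = project.get("investment_usd")
        if value is None:
            continue  # no weight available for this project
        total += float(value)
        if project.get("operator_country") == country:
            matched += float(value)
    if total == 0:
        raise ValueError("no projects with a known investment value")
    return matched / total


# Hypothetical records, for illustration only.
toy_projects = [
    {"name": "A", "investment_usd": 600.0, "operator_country": "US"},
    {"name": "B", "investment_usd": 300.0, "operator_country": "FR"},
    {"name": "C", "investment_usd": 100.0, "operator_country": "CN"},
    {"name": "D", "investment_usd": None, "operator_country": "US"},
]

weighted_share = weighted_operator_share(toy_projects)

# Unweighted share: simple fraction of projects with a U.S. operator.
unweighted_share = sum(
    p["operator_country"] == "US" for p in toy_projects
) / len(toy_projects)

print(f"weighted: {weighted_share:.2f}, unweighted: {unweighted_share:.2f}")
```

In this toy data the weighted and unweighted shares differ because a single large project dominates total investment, which is why the weighting choice matters when investment value is used as a proxy for compute capacity.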