🤖 AI Summary
Causal inference faces practical challenges including high-dimensional covariates, individual noncompliance, and network interference, and conventional model-based approaches often rely on strong parametric or functional-form assumptions. This paper systematically reviews design-based advances in causal inference, focusing on covariate-balancing designs (e.g., stratified randomization, rerandomization), regression adjustment, and network-aware inference, all grounded in the Fisher randomization test and Neyman's asymptotic framework. Its key contribution is a unified, largely model-agnostic account of randomization design and inference under weak assumptions, with the aim of improving identification and statistical efficiency in settings with high-dimensional covariates, noncompliance, and network interference. The methods reviewed provide theoretically rigorous yet practically implementable tools for causal analysis in medicine, the social sciences, and related domains. The paper also identifies promising future directions, including integration across multiple scenarios and co-design of algorithms and experimental protocols.
📝 Abstract
Causal inference, a major research area in statistics and data science, plays a central role across diverse fields such as medicine, economics, education, and the social sciences. Design-based causal inference begins with randomized experiments and conducts statistical inference by leveraging the known randomization mechanism, thereby enabling identification and estimation of causal effects with only weak dependence on outcome models. Grounded in the seminal works of Fisher and Neyman, this paradigm has evolved to include various design strategies, such as stratified randomization and rerandomization, and analytical methods including Fisher randomization tests, Neyman-style asymptotic inference, and regression adjustment. In recent years, with the emergence of complex settings involving high-dimensional data, individual noncompliance, and network interference, design-based causal inference has seen substantial theoretical and methodological advances. This paper provides a systematic review of recent progress in this field, focusing on covariate-balanced randomization designs, design-based statistical inference methods, and their extensions to settings with high-dimensional data, noncompliance, and network interference. It concludes with a perspective on future directions for the theoretical development and practical application of causal inference.
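To make the idea of "leveraging the known randomization mechanism" concrete, the following is a minimal sketch of a Monte Carlo Fisher randomization test. It is an illustration only, not code from the paper: it assumes a completely randomized experiment with a fixed number of treated units, uses the difference in means as the test statistic, and tests the sharp null hypothesis of no effect for any unit. The function name `fisher_randomization_test`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def fisher_randomization_test(y, z, n_draws=10_000, seed=0):
    """Monte Carlo Fisher randomization test of the sharp null of no effect.

    Assumes complete randomization: every assignment with the same number
    of treated units as `z` is equally likely. (Illustrative sketch only.)
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=int)

    def diff_in_means(assign):
        return y[assign == 1].mean() - y[assign == 0].mean()

    observed = diff_in_means(z)

    # Under the sharp null the outcomes are fixed; only the assignment is
    # random. Permuting z redraws an assignment from the same design.
    draws = np.array(
        [diff_in_means(rng.permutation(z)) for _ in range(n_draws)]
    )

    # Two-sided p-value; the +1 terms count the observed assignment itself,
    # and the small tolerance guards against floating-point ties.
    extreme = np.sum(np.abs(draws) >= np.abs(observed) - 1e-12)
    p_value = (1 + extreme) / (n_draws + 1)
    return observed, p_value


# Hypothetical toy data: 4 control units followed by 4 treated units.
y_toy = [1, 2, 3, 4, 11, 12, 13, 14]
z_toy = [0, 0, 0, 0, 1, 1, 1, 1]
estimate, p = fisher_randomization_test(y_toy, z_toy, n_draws=2000)
print(f"difference in means = {estimate:.2f}, p-value = {p:.3f}")
```

The same logic would carry over to other designs mentioned in the abstract, such as stratified randomization or rerandomization, provided the redrawn assignments follow the design actually used (for example, permuting within strata, or discarding redraws that fail the rerandomization balance criterion).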