🤖 AI Summary
This paper addresses the calibration of high-dimensional contingency tables to known marginals in survey inference and global health modeling. It reviews the convex optimization foundation of raking and develops a dual formulation that handles, in one framework, n-dimensional raking, differential weights, bounds on estimates, margins treated either as hard constraints or as noisy aggregate observations, missing data, and uncertainty propagation. The dual formulation also supports a fast, matrix-free optimization approach that applies uniformly to all of these extensions. The methods are implemented in an open-source Python package, and their capabilities are illustrated on synthetic data and real mortality estimates, including the reconciliation of estimates from models with different granularities.
📝 Abstract
Raking is widely used in survey inference and global health models to adjust the observations in contingency tables to given marginals, in the latter case reconciling estimates between models with different granularities. We review the convex optimization foundation of raking and focus on a dual perspective that simplifies and streamlines prior raking extensions and provides new functionality. This perspective enables a unified approach to n-dimensional raking, raking with differential weights, respecting bounds on estimates, raking to margins either as hard constraints or as aggregate observations, handling missing data, and efficient uncertainty propagation. The dual perspective also enables a uniform, fast, and scalable matrix-free optimization approach for all of these extensions. All of the methods are implemented in an open-source Python package with an intuitive user interface, installable from PyPI (https://pypi.org/project/raking/), and we illustrate the capabilities using synthetic data and real mortality estimates.
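To make the basic operation concrete, here is a minimal sketch of classical two-dimensional raking by iterative proportional fitting, written with NumPy only. It is not the package's API and not the dual algorithm described in the abstract; the function name `rake_2d`, its parameters, and the example numbers are assumptions chosen for illustration. It assumes a strictly positive table and row and column targets that share the same total.

```python
import numpy as np


def rake_2d(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Illustrative 2D raking via iterative proportional fitting.

    Hypothetical helper, not part of the `raking` package. Assumes all
    entries of `table` are strictly positive.
    """
    x = np.asarray(table, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)

    # The two sets of margins must describe the same grand total.
    if not np.isclose(row_targets.sum(), col_targets.sum()):
        raise ValueError("row and column targets must share the same total")

    for _ in range(max_iter):
        # Rescale each row so that row sums match the row targets.
        x *= (row_targets / x.sum(axis=1))[:, None]
        # Rescale each column so that column sums match the column targets.
        x *= (col_targets / x.sum(axis=0))[None, :]
        # Columns now match exactly; stop once rows also match.
        if np.allclose(x.sum(axis=1), row_targets, rtol=0, atol=tol):
            break
    return x


# Made-up observations and margins, for illustration only.
observed = np.array([[10.0, 20.0], [30.0, 40.0]])
raked = rake_2d(observed, row_targets=[40.0, 60.0], col_targets=[35.0, 65.0])
```

Each pass only multiplies rows and columns by positive factors, so the raked table matches both sets of margins while keeping the cross-product ratios of the observed table. The extensions listed in the abstract (weights, bounds, noisy margins, missing data) go beyond what this simple alternating scheme covers.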