Optimization perspective on raking

📅 2024-07-30
🤖 AI Summary
This paper addresses the calibration of contingency tables to known marginals in survey inference and global health modeling. It reviews the convex optimization foundation of raking and develops a dual formulation that handles, within one framework, n-dimensional raking, differential weights, bounds on estimates, margins treated either as hard constraints or as aggregate observations, missing data, and uncertainty propagation. The same dual formulation supports a fast, matrix-free optimization approach shared by all of these extensions. The methods are implemented in an open-source Python package and illustrated on synthetic data and real mortality estimates, where raking reconciles estimates from models with different granularities.

📝 Abstract
Raking is widely used in survey inference and global health models to adjust the observations in contingency tables to given marginals; in global health models, it reconciles estimates between models with different granularities. We review the convex optimization foundation of raking and focus on a dual perspective that simplifies and streamlines prior raking extensions and provides new functionality. This perspective enables a unified approach to n-dimensional raking, raking with differential weights, ensuring that bounds on estimates are respected, raking to margins either as hard constraints or as aggregate observations, handling missing data, and efficient uncertainty propagation. The dual perspective also enables a uniform, fast, and scalable matrix-free optimization approach for all of these extensions. All of the methods are implemented in an open-source Python package with an intuitive user interface, installable from PyPI (https://pypi.org/project/raking/), and we illustrate the capabilities using synthetic data and real mortality estimates.
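
To make "adjusting a contingency table to given marginals" concrete, the following is a minimal sketch of classical two-dimensional raking by iterative proportional fitting. It is not the API of the `raking` package mentioned in the abstract; the function name `rake_2d`, its parameters, and the example numbers are all assumptions made for illustration, and the sketch assumes strictly positive cell values.

```python
def rake_2d(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rake a 2D table of positive cells to target row and column sums.

    Hypothetical helper for illustration; not part of the `raking` package.
    """
    total = sum(row_targets)
    if abs(total - sum(col_targets)) > 1e-9 * max(1.0, abs(total)):
        raise ValueError("row and column targets must share the same total")

    # Work on a float copy so the caller's table is left untouched.
    x = [[float(v) for v in row] for row in table]

    for _ in range(max_iter):
        # Rescale each row so it sums to its target.
        for i, target in enumerate(row_targets):
            factor = target / sum(x[i])
            x[i] = [v * factor for v in x[i]]
        # Rescale each column so it sums to its target.
        for j, target in enumerate(col_targets):
            factor = target / sum(row[j] for row in x)
            for row in x:
                row[j] *= factor
        # Columns are exact after the last step; stop once rows agree too.
        if all(abs(sum(x[i]) - t) <= tol for i, t in enumerate(row_targets)):
            break
    return x


# Made-up observations and margins, chosen only to exercise the function.
observed = [[10.0, 20.0], [30.0, 40.0]]
raked = rake_2d(observed, row_targets=[40.0, 60.0], col_targets=[45.0, 55.0])
```

Because each step only multiplies whole rows or whole columns by a constant, the raked table matches both sets of margins while keeping the cross-product ratios of the observed table.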

Problem

Research questions and friction points this paper is trying to address.

Adjusting contingency tables to given marginals in surveys and health models
Providing a unified approach to n-dimensional raking with new functionalities
Enabling fast, scalable optimization for raking extensions and uncertainty propagation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual perspective simplifies raking extensions
Matrix-free optimization for scalable processing
Open source Python package with user interface
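
One way to picture the dual, matrix-free idea is the sketch below. It assumes a chi-square distance between raked and observed cells, chosen here only because it turns the dual problem into a linear system that conjugate gradient can solve; it is not claimed to be the authors' algorithm or the `raking` package's API. The names `apply_A`, `apply_At`, and `rake_chi2_dual` are hypothetical. The constraint operator is never stored as a matrix: it is applied through row and column sums, and its transpose through broadcasting.

```python
import numpy as np


def apply_A(x):
    """Constraint operator: map a table to its (row sums, column sums)."""
    return x.sum(axis=1), x.sum(axis=0)


def apply_At(u, v):
    """Transpose operator: spread row and column multipliers over cells."""
    return u[:, None] + v[None, :]


def rake_chi2_dual(y, row_targets, col_targets, tol=1e-10, max_iter=200):
    """Chi-square raking of a 2D table, solved in the dual variables.

    Hypothetical sketch: minimizes sum((x - y)**2 / (2 * y)) subject to the
    margins, which gives x = y * (1 - A^T lam) with A diag(y) A^T lam = A y - s.
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(row_targets, dtype=float)
    c = np.asarray(col_targets, dtype=float)
    m = y.shape[0]

    def M(lam):
        # Apply A diag(y) A^T without forming any matrix.
        rows, cols = apply_A(y * apply_At(lam[:m], lam[m:]))
        return np.concatenate([rows, cols])

    b = np.concatenate(apply_A(y)) - np.concatenate([r, c])

    # Conjugate gradient on the (singular but consistent) dual system.
    lam = np.zeros(b.size)
    res = b.copy()
    p = res.copy()
    rs = res @ res
    b_norm = max(np.linalg.norm(b), 1.0)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Mp = M(p)
        alpha = rs / (p @ Mp)
        lam += alpha * p
        res -= alpha * Mp
        rs_new = res @ res
        p = res + (rs_new / rs) * p
        rs = rs_new

    # Recover the primal (raked) table from the dual variables.
    return y * (1.0 - apply_At(lam[:m], lam[m:]))


# Made-up table and margins for illustration.
observed = np.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])
raked = rake_chi2_dual(observed, [70.0, 150.0], [55.0, 75.0, 90.0])
```

The number of dual unknowns is the number of margins rather than the number of cells, which is what makes this view attractive for large tables. Note that the chi-square distance does not by itself keep cells positive, so other distances or explicit bounds are needed when positivity matters.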

Ariane Ducellier
Institute for Health Metrics and Evaluation, University of Washington
Alexander Hsu
Institute for Health Metrics and Evaluation, University of Washington; Department of Applied Mathematics, University of Washington
Parkes Kendrick
Institute for Health Metrics and Evaluation, University of Washington
Bill Gustafson
Institute for Health Metrics and Evaluation, University of Washington
L. Dwyer-Lindgren
Institute for Health Metrics and Evaluation, University of Washington
Christopher Murray
Institute for Health Metrics and Evaluation, University of Washington
Peng Zheng
Institute for Health Metrics and Evaluation, University of Washington
Aleksandr Aravkin
University of Washington
Optimization, statistics, inverse problems, convex/variational analysis, algorithm design and implementation.