Optimization perspective on raking

📅 2024-07-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the calibration of high-dimensional contingency tables in survey inference and global health modeling. It reviews the convex optimization foundation of raking and develops a dual formulation that handles, within one framework, n-dimensional raking, differential weights, bounds on estimates, margins treated as either hard constraints or aggregate observations, missing data, and uncertainty propagation. The dual formulation also supports a fast, scalable, matrix-free solver shared by all of these extensions. An open-source Python package implements the methods, and the paper illustrates them on synthetic data and real mortality estimates, including the reconciliation of estimates from models with different granularities.

📝 Abstract

Raking is widely used in survey inference and global health models to adjust the observations in contingency tables to given marginals; in global health models it reconciles estimates between models with different granularities. We review the convex optimization foundation of raking and focus on a dual perspective that simplifies and streamlines prior raking extensions and provides new functionality. This perspective enables a unified approach to n-dimensional raking, raking with differential weights, respecting bounds on estimates, raking to margins either as hard constraints or as aggregate observations, handling missing data, and efficient uncertainty propagation. It also enables a uniform, fast, and scalable matrix-free optimization approach for all of these extensions. All of the methods are implemented in an open-source Python package with an intuitive user interface, installable from PyPI (https://pypi.org/project/raking/), and we illustrate the capabilities using synthetic data and real mortality estimates.
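To make "adjusting a contingency table to given marginals" concrete, here is a minimal sketch of classical two-dimensional raking (iterative proportional fitting) in NumPy. It is an illustration only and does not use the API of the `raking` package; the function name `rake_2d`, its parameters, and the example numbers are assumptions made for this sketch.

```python
import numpy as np


def rake_2d(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Classical two-dimensional raking (iterative proportional fitting).

    Hypothetical helper for illustration; not part of the `raking` package.
    """
    raked = np.asarray(table, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    if not np.isclose(row_targets.sum(), col_targets.sum()):
        raise ValueError("row and column targets must have the same total")

    for _ in range(max_iter):
        # Scale each row so that the row sums match the row targets.
        raked *= (row_targets / raked.sum(axis=1))[:, None]
        # Scale each column so that the column sums match the column targets.
        raked *= (col_targets / raked.sum(axis=0))[None, :]
        # Columns now match exactly; stop once the rows also match.
        if np.max(np.abs(raked.sum(axis=1) - row_targets)) < tol:
            break
    return raked


# Made-up observations and margins, chosen so both margins total 220.
observed = np.array([[10.0, 20.0, 30.0],
                     [40.0, 50.0, 60.0]])
raked = rake_2d(observed, row_targets=[80.0, 140.0], col_targets=[55.0, 75.0, 90.0])
print(raked.sum(axis=1))  # row sums, close to the row targets
print(raked.sum(axis=0))  # column sums, close to the column targets
```

Because each pass only rescales whole rows and whole columns, the raked table keeps the cross-product ratios of the observed table while matching both sets of margins.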

Problem

Research questions and friction points this paper is trying to address.

Adjusting contingency tables to given marginals in surveys and health models

Providing a unified approach to n-dimensional raking with new functionalities

Enabling fast, scalable optimization for raking extensions and uncertainty propagation
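The points above can be written as an optimization problem. The following is a sketch under stated assumptions: it uses the entropic distance commonly associated with classical raking, and the symbols are introduced here for illustration (y for the observations, β for the raked values, A for the matrix that aggregates cells into margins, s for the target margins, λ for the dual variables). The paper's own notation and choice of distance may differ.

```latex
\begin{aligned}
\text{primal:}\quad & \min_{\beta > 0}\; \sum_{i} \left( \beta_i \log\frac{\beta_i}{y_i} - \beta_i + y_i \right)
  \quad \text{subject to} \quad A\beta = s, \\
\text{stationarity:}\quad & \log\frac{\beta_i}{y_i} + (A^\top \lambda)_i = 0
  \;\Longrightarrow\; \beta(\lambda) = y \odot \exp(-A^\top \lambda), \\
\text{dual:}\quad & \min_{\lambda}\; \sum_{i} y_i \exp\!\left(-(A^\top \lambda)_i\right) + \lambda^\top s .
\end{aligned}
```

Under these assumptions the gradient of the dual objective is s − Aβ(λ), so it vanishes exactly when the raked values reproduce the target margins. The dual has one unknown per margin constraint rather than one per table cell.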

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual perspective simplifies raking extensions

Matrix-free optimization for scalable processing

Open-source Python package with an intuitive user interface
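As a rough illustration of the dual, matrix-free idea, the sketch below rakes a two-dimensional table by minimizing a dual objective over one multiplier per row and one per column. It assumes the entropic distance of classical raking, for which the raked values are the observations times the exponential of minus the sum of the row and column multipliers, and it uses SciPy's general-purpose L-BFGS-B solver as a stand-in. It is not the paper's algorithm or the `raking` package's API; `rake_2d_dual` and the example numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def rake_2d_dual(table, row_targets, col_targets):
    """Rake a 2D table by solving the dual problem (entropic distance assumed).

    Hypothetical helper for illustration; not part of the `raking` package.
    """
    y = np.asarray(table, dtype=float)
    s_rows = np.asarray(row_targets, dtype=float)
    s_cols = np.asarray(col_targets, dtype=float)
    if not np.isclose(s_rows.sum(), s_cols.sum()):
        raise ValueError("row and column targets must have the same total")
    n_rows, n_cols = y.shape

    def primal_from_dual(lam):
        # The transpose of the aggregation matrix is applied without forming it:
        # each cell (i, j) receives its row multiplier plus its column multiplier.
        u, v = lam[:n_rows], lam[n_rows:]
        return y * np.exp(-(u[:, None] + v[None, :]))

    def dual_objective(lam):
        beta = primal_from_dual(lam)
        value = beta.sum() + lam[:n_rows] @ s_rows + lam[n_rows:] @ s_cols
        # Aggregation is also matrix-free: it is just the row and column sums.
        grad = np.concatenate([s_rows - beta.sum(axis=1),
                               s_cols - beta.sum(axis=0)])
        return value, grad

    result = minimize(dual_objective, np.zeros(n_rows + n_cols), jac=True,
                      method="L-BFGS-B",
                      options={"gtol": 1e-10, "ftol": 1e-15, "maxiter": 1000})
    return primal_from_dual(result.x)


# Made-up observations and margins, chosen so both margins total 220.
observed = np.array([[10.0, 20.0, 30.0],
                     [40.0, 50.0, 60.0]])
raked = rake_2d_dual(observed, [80.0, 140.0], [55.0, 75.0, 90.0])
print(raked.sum(axis=1))  # row sums, close to the row targets
print(raked.sum(axis=0))  # column sums, close to the column targets
```

The solver only ever works with one multiplier per margin (five numbers here, against six cells), and the aggregation matrix is never built. This is the sense in which a dual formulation can stay cheap as the table grows.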
