Fitting sparse high-dimensional varying-coefficient models with Bayesian regression tree ensembles

📅 2025-10-09

🤖 AI Summary

Varying-coefficient models (VCMs) are hard to fit in high-dimensional settings, where the number of covariates $p$ and/or the number of effect modifiers $R$ exceeds the sample size: existing methods fail to identify which covariates have nonzero effects and which effect modifiers drive them. The paper proposes sparseVCBART, a fully Bayesian method that approximates each coefficient function with a regression tree ensemble. It encourages sparsity through a global-local shrinkage prior on the tree leaf outputs and a hierarchical prior on each tree's splitting probabilities. As a VCM, it keeps the interpretability of a linear model while allowing flexible nonlinear coefficient functions. Theoretically, the posterior contracts at a near-minimax optimal rate, adapting automatically to the unknown sparsity structure and smoothness of the true coefficient functions. Empirically, sparseVCBART achieves predictive accuracy competitive with state-of-the-art methods while producing substantially narrower and better-calibrated uncertainty intervals, especially for null covariate effects.

📝 Abstract

By allowing the effects of $p$ covariates in a linear regression model to vary as functions of $R$ additional effect modifiers, varying-coefficient models (VCMs) strike a compelling balance between interpretable-but-rigid parametric models popular in classical statistics and flexible-but-opaque methods popular in machine learning. But in high-dimensional settings where $p$ and/or $R$ exceed the number of observations, existing approaches to fitting VCMs fail to identify which covariates have a non-zero effect and which effect modifiers drive these effects. We propose sparseVCBART, a fully Bayesian model that approximates each coefficient function in a VCM with a regression tree ensemble and encourages sparsity with a global--local shrinkage prior on the regression tree leaf outputs and a hierarchical prior on the splitting probabilities of each tree. We show that the sparseVCBART posterior contracts at a near-minimax optimal rate, automatically adapting to the unknown sparsity structure and smoothness of the true coefficient functions. Compared to existing state-of-the-art methods, sparseVCBART achieved competitive predictive accuracy and substantially narrower and better-calibrated uncertainty intervals, especially for null covariate effects. We use sparseVCBART to investigate how the effects of interpersonal conversations on prejudice could vary according to the political and demographic characteristics of the respondents.
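To make the setting concrete, the sketch below simulates data from a sparse VCM of the form $y_i = \sum_{j=1}^{p} \beta_j(z_i)\, x_{ij} + \varepsilon_i$ with more covariates than observations. It is only an illustration: the function name `simulate_sparse_vcm`, the two non-null coefficient functions, and all default values are assumptions chosen for the example, not taken from the paper.

```python
import numpy as np

def simulate_sparse_vcm(n=50, p=100, R=20, sigma=0.5, seed=0):
    """Simulate y_i = sum_j beta_j(z_i) * x_ij + noise with sparse beta (illustrative)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))          # covariates, p > n
    Z = rng.uniform(0.0, 1.0, size=(n, R))   # effect modifiers

    # beta[i, j] holds beta_j(z_i); every column starts as a null effect.
    beta = np.zeros((n, p))
    # Covariate 0: a step function of modifier 0 (what a single tree can represent).
    beta[:, 0] = np.where(Z[:, 0] < 0.5, 1.0, -1.0)
    # Covariate 1: a smooth function of modifier 1 (needs an ensemble to approximate).
    beta[:, 1] = np.sin(2.0 * np.pi * Z[:, 1])

    y = np.sum(beta * X, axis=1) + sigma * rng.standard_normal(n)
    return X, Z, beta, y

X, Z, beta, y = simulate_sparse_vcm()
```

In this toy example only 2 of the 100 covariates have a non-zero effect, and each effect depends on a single one of the 20 modifiers, which is the kind of sparsity structure the abstract says existing approaches fail to recover.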

Problem

Research questions and friction points this paper is trying to address.

Identifies sparse varying-coefficient models in high-dimensional settings

Determines which covariates have non-zero effects and which effect modifiers drive those effects

Handles high-dimensional settings where the number of covariates and/or effect modifiers exceeds the number of observations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian regression tree ensembles approximate coefficient functions

Global-local shrinkage prior on the regression tree leaf outputs encourages sparsity

Hierarchical prior on splitting probabilities enhances sparsity
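The three ingredients above can be sketched together in a few lines. This is a simplified illustration under stated assumptions, not the paper's prior: trees are reduced to depth-1 stumps, the leaf scale is taken as a global scale `tau` times a local scale `lam` divided by the square root of the ensemble size, and the splitting probabilities are drawn from a sparse Dirichlet distribution, which is one common hierarchical choice. All function and parameter names are hypothetical.

```python
import numpy as np

def draw_stump(rng, split_probs, leaf_scale):
    """Draw one depth-1 tree: (modifier index, cutpoint, left leaf, right leaf)."""
    r = rng.choice(len(split_probs), p=split_probs)     # which modifier to split on
    cut = rng.uniform(0.0, 1.0)                         # cutpoint on [0, 1]
    left, right = leaf_scale * rng.standard_normal(2)   # shrunken leaf outputs
    return int(r), cut, left, right

def eval_ensemble(stumps, Z):
    """Sum the piecewise-constant outputs of all stumps at the rows of Z."""
    out = np.zeros(Z.shape[0])
    for r, cut, left, right in stumps:
        out += np.where(Z[:, r] < cut, left, right)
    return out

def draw_coefficient_function(rng, R, M=50, tau=1.0, lam=1.0, alpha=1.0):
    """Draw an M-stump ensemble for one coefficient function (assumed prior)."""
    # Sparse Dirichlet: a small alpha / R concentrates mass on few modifiers.
    split_probs = rng.dirichlet(np.full(R, alpha / R))
    # Global scale tau and local scale lam jointly shrink every leaf output.
    leaf_scale = tau * lam / np.sqrt(M)
    return [draw_stump(rng, split_probs, leaf_scale) for _ in range(M)]
```

Under these assumptions, a local scale near zero collapses the whole ensemble for that covariate toward the zero function (a null effect), while concentrated splitting probabilities mean the trees split on only a few effect modifiers.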
