Random Forest Weighted Local Fréchet Regression with Random Objects

📅 2022-02-10
🏛️ Journal of machine learning research
📈 Citations: 1
Influential: 0
🤖 AI Summary
Fréchet regression, designed for responses in metric spaces (e.g., distribution functions, SPD matrices, spherical data) with Euclidean predictors, relies on nonparametric kernel smoothing in its local form and therefore suffers from the curse of dimensionality. Method: We propose a random-forest-weighted local Fréchet regression framework in which tree ensembles generate a locally adaptive kernel, and we develop both local-constant and local-linear estimators. Contribution/Results: For the local-constant estimator, we establish consistency, a rate of convergence, and asymptotic normality, using the theory of infinite-order U-processes and infinite-order $M_{m_n}$-estimators. This covers the existing large-sample theory of random forests with Euclidean responses as a special case. In numerical studies and in applications to New York taxi data and human mortality data, the methods outperform existing Fréchet regression approaches.
📝 Abstract
Statistical analysis is increasingly confronted with complex data from metric spaces. Petersen and Müller (2019) established a general paradigm of Fréchet regression with complex metric space valued responses and Euclidean predictors. However, the local approach therein involves nonparametric kernel smoothing and suffers from the curse of dimensionality. To address this issue, we in this paper propose a novel random forest weighted local Fréchet regression paradigm. The main mechanism of our approach relies on a locally adaptive kernel generated by random forests. Our first method uses these weights as the local average to solve the conditional Fréchet mean, while the second method performs local linear Fréchet regression, both significantly improving existing Fréchet regression methods. Based on the theory of infinite order U-processes and infinite order $M_{m_n}$-estimator, we establish the consistency, rate of convergence, and asymptotic normality for our local constant estimator, which covers the current large sample theory of random forests with Euclidean responses as a special case. Numerical studies show the superiority of our methods with several commonly encountered types of responses such as distribution functions, symmetric positive-definite matrices, and sphere data. The practical merits of our proposals are also demonstrated through the application to New York taxi data and human mortality data.
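The first method described in the abstract can be written compactly as follows. The notation is an assumption made for illustration and may differ from the paper's: $(\Omega, d)$ denotes the metric space of responses, $(X_i, Y_i)$, $i = 1, \dots, n$, the sample, and $\alpha_i(x) \ge 0$ the random forest weights at a query point $x$, assumed to sum to one.

```latex
% Conditional Fréchet mean (population target) and a
% random-forest-weighted local-constant estimate of it.
% Symbols \Omega, d, \alpha_i are illustrative assumptions.
\[
  m_\oplus(x) = \operatorname*{arg\,min}_{y \in \Omega}
      \mathbb{E}\!\left[ d^2(Y, y) \mid X = x \right],
  \qquad
  \hat m_\oplus(x) = \operatorname*{arg\,min}_{y \in \Omega}
      \sum_{i=1}^{n} \alpha_i(x)\, d^2(Y_i, y).
\]
% Euclidean special case: \Omega = \mathbb{R}, d(a, b) = |a - b| gives
\[
  \hat m_\oplus(x) = \sum_{i=1}^{n} \alpha_i(x)\, Y_i ,
\]
% i.e. the usual weighted-average form of a random forest prediction.
```

The Euclidean case shows why results for this estimator can contain classical random forest theory as a special case: the weighted Fréchet mean reduces to an ordinary weighted average.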
Problem

Research questions and friction points this paper is trying to address.

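Addresses the curse of dimensionality in kernel-based local Fréchet regression
Proposes random forest weighted local Fréchet regression
Improves regression for metric-space-valued responses such as distribution functions, symmetric positive-definite matrices, and sphere data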
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random forest weighted local Fréchet regression
Locally adaptive kernel via random forests
Local constant and local linear Fréchet estimators built on the forest weights
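The ideas listed above can be illustrated with a minimal sketch of a forest-weighted local-constant estimator. It is not the authors' implementation and rests on several assumptions: the trees use purely random axis-aligned splits (the paper's splitting rule is not described here), the responses are one-dimensional distributions stored as quantile functions on a common probability grid, and the metric is the 2-Wasserstein distance, for which the weighted Fréchet mean is the weighted average of quantile functions. All function names (`grow_tree`, `grow_forest`, `forest_weights`, `wasserstein_frechet_mean`) are hypothetical.

```python
import random
from statistics import NormalDist


def grow_tree(X, idx, rng, min_leaf=5):
    """Partition the sample indices `idx` with random axis-aligned splits."""
    if len(idx) <= min_leaf:
        return {"leaf": idx}
    feat = rng.randrange(len(X[0]))
    vals = [X[i][feat] for i in idx]
    thr = rng.uniform(min(vals), max(vals))
    left = [i for i in idx if X[i][feat] <= thr]
    right = [i for i in idx if X[i][feat] > thr]
    if not left or not right:  # degenerate split: stop here
        return {"leaf": idx}
    return {"feat": feat, "thr": thr,
            "left": grow_tree(X, left, rng, min_leaf),
            "right": grow_tree(X, right, rng, min_leaf)}


def grow_forest(X, n_trees=100, seed=0):
    """Grow each tree on a bootstrap draw (duplicate indices dropped)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        sample = sorted({rng.randrange(n) for _ in range(n)})
        forest.append(grow_tree(X, sample, rng))
    return forest


def forest_weights(forest, x, n):
    """Weight of sample i = average over trees of 1/|leaf| if i shares x's leaf."""
    w = [0.0] * n
    for tree in forest:
        node = tree
        while "leaf" not in node:
            go_left = x[node["feat"]] <= node["thr"]
            node = node["left"] if go_left else node["right"]
        for i in node["leaf"]:
            w[i] += 1.0 / (len(forest) * len(node["leaf"]))
    return w


def wasserstein_frechet_mean(Q, w):
    """Weighted Fréchet mean under 2-Wasserstein: average quantile functions."""
    return [sum(wi * qi[k] for wi, qi in zip(w, Q)) for k in range(len(Q[0]))]


# Toy data (an assumption of this sketch): Y_i is N(x1, (1 + x2)^2),
# stored as its quantile function on a grid of probabilities.
rng = random.Random(1)
n = 200
X = [[rng.random(), rng.random()] for _ in range(n)]
probs = [k / 20 for k in range(1, 20)]
z = [NormalDist().inv_cdf(p) for p in probs]
Q = [[x[0] + (1.0 + x[1]) * zk for zk in z] for x in X]

forest = grow_forest(X)
x0 = [0.5, 0.5]
w = forest_weights(forest, x0, n)          # locally adaptive kernel at x0
q_hat = wasserstein_frechet_mean(Q, w)     # estimated quantile function at x0
```

The weights play the role that kernel weights play in classical local Fréchet regression, but they come from tree partitions rather than a fixed bandwidth. For other response spaces (SPD matrices, the sphere) the averaging step would be replaced by a minimizer of the weighted sum of squared distances in that space.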
Rui Qiu
School of Statistics, KLATASDS-MOE, East China Normal University, Shanghai 200062, China
Zhou Yu
School of Statistics, KLATASDS-MOE, East China Normal University, Shanghai 200062, China
Ruoqing Zhu
University of Illinois Urbana-Champaign
Personalized Medicine, Reinforcement Learning, Random Forests, Survival Analysis, Dimension Reduction