Testing Clustered Equal Predictive Ability with Unknown Clusters

📅 2025-07-19

🤖 AI Summary

This paper develops tests of clustered equal predictive ability (C-EPA) for panel data when the cluster structure is unknown. The method combines Panel Kmeans clustering with selective conditional inference. Panel Kmeans estimates the latent clusters from the time series variation of the panel rather than from time averages alone, and selective inference accounts for the fact that the tested hypotheses depend on those estimated clusters. For pairwise equality, the authors derive a Wald-type statistic whose square root, conditional on the estimated cluster structure, has a truncated $\chi$ limiting distribution, with a truncation set that is a polyhedron in the data space. A $p$-value combination method aggregates the evidence against the pairwise equality and overall EPA null hypotheses to test C-EPA. The number of clusters is selected by an information criterion under the alternative hypothesis, and the paper proves that this selection step requires no further conditioning for the test to remain valid. Monte Carlo simulations indicate good finite-sample performance, and an empirical application compares traditional time series models and machine learning methods for forecasting exchange rates.

📝 Abstract

We develop new tests of clustered equal predictive ability (C-EPA) in panels where the clusters are unknown and estimated by a Panel Kmeans algorithm. This algorithm differs from the standard Kmeans algorithm by employing the time series variation of the panel rather than relying merely on time averages of observations. To address the challenge of testing hypotheses that depend on data-driven cluster estimates, we adopt a selective conditional inference framework. Specifically, we derive a Wald-type test statistic for pairwise equality and show that the limiting distribution of its square root conditional on the estimated cluster structure is that of a truncated $\chi$ random variable. We characterize the associated truncation set as a polyhedron in the data space. As a test of the C-EPA hypothesis, we propose a $p$-value combination method which aggregates the evidence against the pairwise equality and overall EPA null hypotheses. In addition, we prove that using an information criterion to select the unknown number of clusters under the alternative hypothesis prior to testing does not require further conditioning to obtain a valid test. Monte Carlo simulations confirm the excellent finite sample performance of the proposed tests. An empirical application to forecasting exchange rates using traditional time series models as well as machine learning methods illustrates the practical importance of our procedure.
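To make the truncated $\chi$ reference distribution concrete, the following is a minimal sketch of how a selective p-value could be computed once the truncation set is known. It assumes the truncation set has already been reduced to a union of intervals on the positive half-line; the paper characterizes the set as a polyhedron in the data space, and that reduction is not shown here. The function names `chi_sf` and `truncated_chi_pvalue`, the interval representation, and the integer degrees of freedom are illustrative assumptions, not the authors' implementation.

```python
import math


def chi_sf(x, df):
    """Survival function P(chi_df >= x) for integer df >= 1 (standard library only)."""
    if x <= 0:
        return 1.0
    if math.isinf(x):
        return 0.0
    y = 0.5 * x * x  # chi_df >= x  <=>  chi-square_df >= x^2
    if df % 2 == 0:
        # Even df: Q(m, y) = exp(-y) * sum_{j=0}^{m-1} y^j / j!, with m = df / 2
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: Q(m + 1/2, y) = erfc(sqrt(y)) + exp(-y) * sum_{j=1}^{m} y^(j - 1/2) / Gamma(j + 1/2)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += y ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def truncated_chi_pvalue(stat, df, intervals):
    """P(chi_df >= stat | chi_df in S), where S is a union of disjoint (lo, hi) intervals.

    `intervals` is a hypothetical representation of the truncation set.
    """
    numerator = 0.0
    denominator = 0.0
    for lo, hi in intervals:
        denominator += chi_sf(lo, df) - chi_sf(hi, df)
        tail_start = max(lo, stat)
        if tail_start < hi:
            numerator += chi_sf(tail_start, df) - chi_sf(hi, df)
    if denominator <= 0.0:
        raise ValueError("truncation set has zero probability")
    return numerator / denominator


# Illustrative numbers only: square root of the Wald statistic equal to 2.0,
# two degrees of freedom, truncation set [0.5, infinity).
p_selective = truncated_chi_pvalue(2.0, 2, [(0.5, math.inf)])
p_naive = chi_sf(2.0, 2)
print(p_selective, p_naive)
```

In this example the selective p-value is larger than the naive one, because conditioning on the truncation set removes the probability mass below 0.5 from the reference distribution.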

Problem

Research questions and friction points this paper is trying to address.

Testing equal predictive ability with unknown clusters

Developing selective inference for data-driven cluster estimates

Selecting the number of clusters by information criterion without additional conditioning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Panel Kmeans algorithm uses time series variation

Selective conditional inference for data-driven clusters

P-value combination method tests C-EPA hypothesis
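The idea of aggregating several p-values into one test can be sketched with a Bonferroni combination. This is a generic stand-in chosen for illustration: the summary does not specify the paper's combination rule, so the function `bonferroni_combine` and the example p-values below are assumptions, not the authors' procedure.

```python
def bonferroni_combine(pvalues):
    """Bonferroni combination: reject the joint null at level alpha if the result is <= alpha."""
    pvalues = list(pvalues)
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(p < 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in [0, 1]")
    # Scale the smallest p-value by the number of tests and cap the result at 1.
    return min(1.0, len(pvalues) * min(pvalues))


# Hypothetical inputs: two pairwise-equality p-values and one overall EPA p-value.
combined = bonferroni_combine([0.04, 0.20, 0.01])
print(combined)
```

Bonferroni is valid under arbitrary dependence between the component tests, which makes it a conservative default when the joint distribution of the p-values is unknown.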
