Consistent Selection of the Number of Groups in Panel Models via Cross-Validation

📅 2022-09-12
🤖 AI Summary
Group-wise panel data modeling lacks a consistent, data-driven criterion for selecting the number of groups. This paper proposes a time-split cross-validation method: keeping the group structure of individuals intact, it partitions the time dimension into two folds, estimates group memberships and parameters on one fold, and evaluates a designed criterion (e.g., prediction error) on the other. The number of groups is then chosen by minimizing the criterion averaged across folds. The method is fully data-driven, involves no additional tuning parameters, and applies to a wide range of panel data models; the paper establishes estimation consistency theoretically. Experiments on a variety of synthetic datasets, together with an application to heterogeneous stock volatility patterns in the Chinese stock market through the financial crisis, illustrate the method's advantages.
📝 Abstract
Group number selection is a key problem for group panel data modeling. In this work, we develop a cross-validation (CV) method to tackle this problem. Specifically, we split the panel data into two data folds on the time span, with group structure preserved for individuals. We first estimate the group memberships and parameters on one data fold, then we plug in the estimates and utilize the other data fold to evaluate a designed criterion. Subsequently, the group number is estimated by minimizing the average criterion across all data folds. The proposed CV method has two advantages compared to existing approaches. First, the method is totally data-driven, thus no further tuning parameters are involved. Second, the method can be flexibly applied to a wide range of panel data models. Theoretically, we establish the estimation consistency by taking advantage of the optimization property of the estimation algorithm. Experiments are carried out with a variety of synthetic datasets and panel models to further illustrate the advantages of the proposed method. Lastly, the CV method is employed to analyze the heterogeneous patterns of stock volatilities in the Chinese stock market through the financial crisis.
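The time-split procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the model is a hypothetical scalar-slope panel y_it = x_it * beta_{g_i} + e_it, the estimator is a simple k-means-style alternation between assignment and pooled OLS, the two folds are the first and second halves of the time span, and the criterion is held-out mean squared prediction error. All function names (`fit_groups`, `cv_criterion`, `select_G`) and the simulated data are invented for illustration.

```python
import numpy as np

def fit_groups(y, x, G, n_iter=50):
    """Assumed estimator: alternate group assignment and pooled OLS slopes."""
    # Individual OLS slopes give a deterministic starting point.
    b_i = (x * y).sum(axis=1) / (x * x).sum(axis=1)
    beta = np.quantile(b_i, (np.arange(G) + 0.5) / G)
    for _ in range(n_iter):
        # Assignment step: each individual joins the group with the smallest SSR.
        ssr = ((y[:, :, None] - x[:, :, None] * beta[None, None, :]) ** 2).sum(axis=1)
        g = ssr.argmin(axis=1)
        # Update step: pooled OLS slope within each non-empty group.
        new_beta = beta.copy()
        for k in range(G):
            m = g == k
            if m.any():
                new_beta[k] = (x[m] * y[m]).sum() / (x[m] ** 2).sum()
        if np.allclose(new_beta, beta):
            break
        beta = new_beta
    return g, beta

def cv_criterion(y, x, G):
    """Average held-out squared error over the two time folds."""
    T = y.shape[1]
    folds = [np.arange(T // 2), np.arange(T // 2, T)]
    losses = []
    for train, test in [(folds[0], folds[1]), (folds[1], folds[0])]:
        # Estimate memberships and slopes on one fold of time periods ...
        g, beta = fit_groups(y[:, train], x[:, train], G)
        # ... then plug them in and evaluate on the other fold.
        resid = y[:, test] - x[:, test] * beta[g][:, None]
        losses.append((resid ** 2).mean())
    return float(np.mean(losses))

def select_G(y, x, candidates):
    """Pick the candidate group number with the smallest averaged criterion."""
    scores = {G: cv_criterion(y, x, G) for G in candidates}
    return min(scores, key=scores.get), scores

# Simulated panel (illustrative only): 90 individuals, 40 periods, 3 true groups.
rng = np.random.default_rng(0)
N, T = 90, 40
true_beta = np.array([-2.0, 0.0, 2.0])
g_true = np.repeat(np.arange(3), N // 3)
x = rng.normal(size=(N, T))
y = x * true_beta[g_true][:, None] + rng.normal(size=(N, T))

G_hat, scores = select_G(y, x, range(1, 6))
print(G_hat, scores)
```

Every individual appears in both folds, so memberships estimated on one time block can be reused directly on the other; this is the sense in which the split preserves the group structure. No penalty term or tuning constant enters the selection, only the list of candidate group numbers.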
Problem

Research questions and friction points this paper is trying to address.

Selecting optimal group number in panel data models
Developing data-driven cross-validation method for group selection
Ensuring consistent group estimation without tuning parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-validation method for group number selection
Data-driven approach without tuning parameters
Applicable to diverse panel data models