🤖 AI Summary
This study investigates improved methods for forecasting the realized variance of the Dow Jones Industrial Average constituents. Using only lagged daily, weekly, and monthly realized variances as predictors, it systematically compares the predictive performance of regularized regression, regression trees, neural networks, and the classical Heterogeneous Autoregressive (HAR) model. The results show that the machine learning approaches consistently outperform the HAR model, even with minimal hyperparameter tuning, particularly at longer forecast horizons, and effectively extract incremental predictive information from additional predictors. The paper also applies Accumulated Local Effects (ALE) analysis to uncover divergent rankings of predictor importance across models, improving the interpretability of volatility modeling and offering new insight into the drivers of realized variance dynamics.
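As a minimal sketch of the benchmark setting described above: the HAR model regresses next-day realized variance on its daily, weekly, and monthly lags. The 1/5/22-day averaging windows and the synthetic series below are conventional choices assumed for illustration, not details taken from the paper.

```python
import numpy as np

def har_features(rv):
    """Build HAR predictors: daily lag, 5-day average, 22-day average of RV."""
    rv = np.asarray(rv, dtype=float)
    rows, y = [], []
    for t in range(22, len(rv)):       # need 22 past days for the monthly lag
        daily = rv[t - 1]
        weekly = rv[t - 5:t].mean()
        monthly = rv[t - 22:t].mean()
        rows.append([1.0, daily, weekly, monthly])  # intercept + 3 lags
        y.append(rv[t])
    return np.array(rows), np.array(y)

# Fit by OLS on a synthetic (illustrative) realized-variance series.
rng = np.random.default_rng(0)
rv = np.abs(rng.standard_normal(300)) * 0.01
X, y = har_features(rv)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
forecast = X[-1] @ beta                # in-sample one-step-ahead fit
```

The ML comparison in the paper uses these same three lags as inputs, so any forecast gain reflects a better functional form rather than extra information.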
📝 Abstract
We examine how accurately machine learning (ML) forecasts the realized variance of Dow Jones Industrial Average index constituents. We compare several ML algorithms, including regularization, regression trees, and neural networks, against multiple Heterogeneous AutoRegressive (HAR) models. The ML models are implemented with minimal hyperparameter tuning. Despite this, ML is competitive and beats the HAR lineage, even when the only predictors are the daily, weekly, and monthly lags of realized variance. The forecast gains are more pronounced at longer horizons. We attribute this to the higher persistence of the ML models, which helps to approximate the long memory of realized variance. ML also excels at extracting incremental information about future volatility from additional predictors. Lastly, we propose an ML measure of variable importance based on accumulated local effects. It shows that while the models agree on the most important predictors, they disagree on their ranking, which helps to reconcile our results.
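The ALE-based importance measure can be sketched roughly as follows: compute a first-order accumulated-local-effects curve for each predictor and summarize it by the range of the centered curve. The quantile binning and the range-as-importance summary here are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def ale_importance(predict, X, j, n_bins=10):
    """Range of the centered first-order ALE curve for feature j.
    `predict` is any fitted model's prediction function (assumed given)."""
    x = X[:, j]
    qs = np.quantile(x, np.linspace(0, 1, n_bins + 1))  # bin edges
    effects = []
    for lo, hi in zip(qs[:-1], qs[1:]):
        mask = (x >= lo) & (x <= hi)
        if not mask.any():
            effects.append(0.0)
            continue
        Xlo, Xhi = X[mask].copy(), X[mask].copy()
        Xlo[:, j], Xhi[:, j] = lo, hi
        # local effect: mean prediction change across the bin
        effects.append(np.mean(predict(Xhi) - predict(Xlo)))
    ale = np.cumsum(effects)
    ale -= ale.mean()                  # center the accumulated curve
    return ale.max() - ale.min()

# Illustrative check on a linear model: importance tracks |coefficient|.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
coefs = np.array([2.0, 0.5, 0.0])
predict = lambda Z: Z @ coefs
imps = [ale_importance(predict, X, j) for j in range(3)]
```

Because the curve is built from local prediction differences, the measure is model-agnostic, which is what allows importance rankings to be compared across HAR, tree, and neural-network forecasts.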