🤖 AI Summary
Conventional option pricing models, such as the binomial tree, neglect market microstructure, leading to systematic pricing biases. Method: This paper proposes a binomial option pricing framework that integrates microstructural features: it uses Random Forest classifiers to model path-dependent transition probabilities from high-frequency microstructural variables (e.g., order flow imbalance) while preserving no-arbitrage conditions. The methodology encompasses high-frequency data preprocessing, microstructural feature extraction, and path-dependent probabilistic modeling. Results: Evaluated on 46,655 minute-level SPY observations from January to June 2025, the model achieves an AUC of 88.25% for one-step price-movement prediction, with order flow imbalance the most influential predictor at 43.2% of feature importance. After time-scaling inconsistencies in tree construction are resolved, the model's option prices deviate by 13.79% from Black-Scholes benchmarks, which indicates how strongly microstructure affects fair value estimates. Computational limitations restrict the model to short-term derivatives.
📝 Abstract
We propose a machine learning-based extension of the classical binomial option pricing model that incorporates key market microstructure effects. Traditional models assume frictionless markets, overlooking empirical features such as bid-ask spreads, discrete price movements, and serial return correlations. Our framework augments the binomial tree with path-dependent transition probabilities estimated via Random Forest classifiers trained on high-frequency market data. This approach preserves no-arbitrage conditions while embedding real-world trading dynamics into the pricing model.
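As a rough illustration of a binomial tree with path-dependent transition probabilities, the Python sketch below prices a European call where the up-probability depends on the previous move. It is not the paper's implementation: the function name `price_european_call`, the callable `p_up`, the Cox-Ross-Rubinstein choice of up/down factors, and the restriction to one step of memory are all assumptions made for illustration. In the paper the probabilities come from a Random Forest; here any callable can stand in. The abstract does not say how the learned probabilities are reconciled with the risk-neutral measure, so the sketch only checks the necessary one-period condition d < e^{rΔt} < u.

```python
import math
from functools import lru_cache


def price_european_call(s0, strike, r, sigma, T, n_steps, p_up):
    """Price a European call on a binomial tree with path-dependent moves.

    p_up(last) is a hypothetical callable returning the up-probability
    given the previous move: +1 (up), -1 (down), or 0 at the root.
    """
    dt = T / n_steps
    u = math.exp(sigma * math.sqrt(dt))  # CRR up factor (assumed here)
    d = 1.0 / u                          # CRR down factor
    growth = math.exp(r * dt)

    # Necessary condition for no-arbitrage in a one-period binomial model.
    if not d < growth < u:
        raise ValueError("no-arbitrage requires d < exp(r*dt) < u")
    disc = 1.0 / growth

    @lru_cache(maxsize=None)
    def value(step, ups, last):
        # State: time step, number of up-moves so far, and the last move.
        if step == n_steps:
            s = s0 * u**ups * d**(step - ups)
            return max(s - strike, 0.0)
        p = p_up(last)
        if not 0.0 < p < 1.0:
            raise ValueError("transition probability must lie in (0, 1)")
        up_val = value(step + 1, ups + 1, 1)
        down_val = value(step + 1, ups, -1)
        return disc * (p * up_val + (1.0 - p) * down_val)

    return value(0, 0, 0)
```

Passing the constant risk-neutral probability q = (e^{rΔt} - d)/(u - d) recovers the standard Cox-Ross-Rubinstein price, which gives a quick sanity check. A callable such as `lambda last: q + 0.05 * last` (a hypothetical momentum effect) moves the price away from that baseline.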
Using 46,655 minute-level observations of SPY from January to June 2025, we achieve an AUC of 88.25% in forecasting one-step price movements. Order flow imbalance is identified as the most influential predictor, contributing 43.2% to feature importance. After resolving time-scaling inconsistencies in tree construction, our model yields option prices that deviate by 13.79% from Black-Scholes benchmarks, highlighting the impact of microstructure on fair value estimation. While computational limitations restrict the model to short-term derivatives, our results offer a robust, data-driven alternative to classical pricing methods grounded in empirical market behavior.
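To make the two headline quantities concrete, the sketch below computes a simple order flow imbalance and a rank-based AUC in plain Python. Several things here are assumptions rather than the paper's definitions: the abstract does not define order flow imbalance, so a normalized buy-minus-sell volume ratio is used; the bar values are made up; and the raw imbalance is scored directly, whereas the paper scores the output of a Random Forest classifier.

```python
def order_flow_imbalance(buy_volume, sell_volume):
    """Normalized imbalance in [-1, 1] (assumed definition); 0.0 if no volume."""
    total = buy_volume + sell_volume
    return 0.0 if total == 0 else (buy_volume - sell_volume) / total


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up minute bars: (buy volume, sell volume, 1 if the next move was up).
bars = [(900, 100, 1), (400, 600, 1), (550, 450, 0), (100, 900, 0)]
scores = [order_flow_imbalance(b, s) for b, s, _ in bars]
labels = [y for _, _, y in bars]
auc = roc_auc(labels, scores)
```

On this toy data three of the four positive-negative pairs are ranked correctly, so `auc` is 0.75. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.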