The Enhanced Physics-Informed Kolmogorov-Arnold Networks: Applications of Newton's Laws in Financial Deep Reinforcement Learning (RL) Algorithms

📅 2026-02-01
🤖 AI Summary
This work addresses the unstable training, poor generalization, and limited interpretability that conventional deep reinforcement learning approaches commonly exhibit in financial trading. To overcome these limitations, the authors propose Physics-Informed Kolmogorov–Arnold Networks (PIKANs), which integrate physics-informed regularization with the Kolmogorov–Arnold Network (KAN) architecture. By replacing traditional multilayer perceptrons with KAN layers and incorporating a second-order temporal consistency constraint derived from Newtonian mechanics, PIKANs improve policy stability, interpretability, and generalization. Empirical evaluations show that the proposed method consistently outperforms baseline models across Chinese, U.S., and Vietnamese markets, achieving superior cumulative return, annualized return, Sharpe ratio, and Calmar ratio while significantly reducing maximum drawdown.

📝 Abstract
Deep Reinforcement Learning (DRL), a subset of machine learning focused on sequential decision-making, has emerged as a powerful approach for tackling financial trading problems. In finance, DRL is commonly used either to generate discrete trade signals or to determine continuous portfolio allocations. In this work, we propose a novel reinforcement learning framework for portfolio optimization that incorporates Physics-Informed Kolmogorov-Arnold Networks (PIKANs) into several DRL algorithms. The approach replaces conventional multilayer perceptrons with Kolmogorov-Arnold Networks (KANs) in both actor and critic components, utilizing learnable B-spline univariate functions to achieve parameter-efficient and more interpretable function approximation. During actor updates, we introduce a physics-informed regularization loss that promotes second-order temporal consistency between observed return dynamics and the action-induced portfolio adjustments. The proposed framework is evaluated across three equity markets (China, Vietnam, and the United States), covering both emerging and developed economies. Across all three markets, PIKAN-based agents consistently deliver higher cumulative and annualized returns, superior Sharpe and Calmar ratios, more favorable drawdown characteristics, and more stable training than both standard DRL baselines and classical online portfolio-selection methods. The approach is particularly valuable in highly dynamic and noisy financial markets, where conventional DRL often suffers from instability and poor generalization.
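The abstract's core architectural move — replacing MLP layers with KAN layers built from learnable B-spline univariate functions on every input–output edge — can be illustrated with a minimal numpy sketch. This is a hypothetical re-implementation under common assumptions (uniform knot grid, cubic splines, random initial coefficients), not the authors' code; `bspline_basis`, `KANLayer`, and all parameter names are illustrative.

```python
import numpy as np

def bspline_basis(x, grid, k=3):
    """Evaluate all degree-k B-spline basis functions at points x via the
    Cox-de Boor recursion. grid is a uniform, non-decreasing knot vector."""
    x = np.asarray(x)[..., None]                            # (..., 1)
    # degree-0 bases: indicator of each knot interval
    B = ((x >= grid[:-1]) & (x < grid[1:])).astype(float)   # (..., n_knots - 1)
    for d in range(1, k + 1):
        left = (x - grid[: -(d + 1)]) / (grid[d:-1] - grid[: -(d + 1)])
        right = (grid[d + 1:] - x) / (grid[d + 1:] - grid[1:-d])
        B = left * B[..., :-1] + right * B[..., 1:]
    return B                                                # (..., n_bases)

class KANLayer:
    """One Kolmogorov-Arnold layer: each (input, output) edge carries its own
    learnable univariate function, parameterized as a linear combination of
    B-spline bases; outputs sum the edge functions over the inputs."""
    def __init__(self, in_dim, out_dim, n_grid=8, k=3, x_range=(-1.0, 1.0), seed=0):
        rng = np.random.default_rng(seed)
        h = (x_range[1] - x_range[0]) / n_grid
        # extend the knot vector k steps past each end so [x_min, x_max] is covered
        self.grid = np.arange(-k, n_grid + k + 1) * h + x_range[0]
        self.k = k
        # spline coefficients, one set per edge: (in_dim, out_dim, n_bases)
        self.coef = rng.normal(scale=0.1, size=(in_dim, out_dim, n_grid + k))

    def __call__(self, x):
        # x: (batch, in_dim) -> B-spline features per input: (batch, in_dim, n_bases)
        feats = bspline_basis(x, self.grid, self.k)
        # apply each edge's univariate function, then sum over the inputs
        return np.einsum("bif,iof->bo", feats, self.coef)   # (batch, out_dim)
```

Because each edge is a spline rather than a weight scalar behind a fixed activation, the learned univariate functions can be plotted directly, which is one route to the interpretability the abstract claims.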
Problem

Research questions and friction points this paper is trying to address.

Deep Reinforcement Learning
Portfolio Optimization
Financial Markets
Generalization
Training Stability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-Informed Kolmogorov-Arnold Networks
Deep Reinforcement Learning
Portfolio Optimization
B-spline Function Approximation
Temporal Consistency Regularization
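The temporal-consistency regularization listed above penalizes disagreement between second-order (acceleration-like) dynamics of the observed returns and of the action-induced portfolio adjustments. The paper's exact loss is not reproduced on this page, so the sketch below shows one plausible instantiation under stated assumptions: discrete second differences as the "acceleration", a mean-squared penalty, and a weighting factor `lam`; `second_diff` and `physics_informed_loss` are hypothetical names.

```python
import numpy as np

def second_diff(x, axis=0):
    """Discrete second-order temporal difference, the Newtonian
    'acceleration' analogue for a sampled trajectory."""
    return np.diff(x, n=2, axis=axis)

def physics_informed_loss(weights, returns, lam=0.1):
    """Hypothetical second-order temporal-consistency penalty: the
    acceleration of the actor's portfolio weights is encouraged to track
    the acceleration of the observed asset returns.

    weights: (T, n_assets) allocations produced by the actor over time
    returns: (T, n_assets) observed per-asset returns over the same window
    """
    acc_w = second_diff(weights)    # (T - 2, n_assets)
    acc_r = second_diff(returns)    # (T - 2, n_assets)
    return lam * np.mean((acc_w - acc_r) ** 2)
```

In the framework described by the abstract, a term of this shape would be added to the actor's loss during policy updates, discouraging abrupt reallocations that are not justified by the underlying return dynamics.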
Trang Thoi
Virginia Polytechnic Institute and State University, Blacksburg, USA
Hung Tran
Virginia Polytechnic Institute and State University, Blacksburg, USA
Tram Thoi
Ho Chi Minh University of Banking, Ho Chi Minh City, Vietnam
Huaiyang Zhong
Assistant Professor, Virginia Tech