🤖 AI Summary
Standard state normalization techniques in reinforcement learning (RL) for portfolio optimization discard absolute asset value information, such as nominal prices and market capitalizations, leading to substantial performance degradation in non-cryptocurrency markets (e.g., IBOVESPA, NYSE). Method: The study systematically compares two prevalent normalization approaches across three heterogeneous financial markets, analyzing their impact on numerical stability, economic interpretability, and generalization. Contribution/Results: It demonstrates that conventional preprocessing, while improving numerical conditioning, impairs the agent's ability to perceive absolute economic magnitudes, thereby reducing risk-adjusted returns and cross-market transferability. The work advocates preserving interpretable, dimensionally consistent economic quantities in state representation design, rather than applying blind standardization. Empirical results show that eliminating or reengineering normalization improves annualized returns by 12–28% and enhances strategy robustness, challenging the prevailing assumption in RL-based finance that normalization is universally beneficial.
📝 Abstract
Recently, reinforcement learning has achieved remarkable results in various domains, including robotics, games, natural language processing, and finance. In the financial domain, this approach has been applied to tasks such as portfolio optimization, where an agent continuously adjusts the allocation of assets within a financial portfolio to maximize profit. Numerous studies have introduced new simulation environments, neural network architectures, and training algorithms for this purpose. Among these, a domain-specific policy gradient algorithm has gained significant attention in the research community for being lightweight, fast, and for outperforming other approaches. However, recent studies have shown that this algorithm can yield inconsistent results and underperform, especially when the portfolio does not consist of cryptocurrencies. One possible explanation for this issue is that the commonly used state normalization method may cause the agent to lose critical information about the true value of the assets being traded. This paper explores this hypothesis by evaluating two of the most widely used normalization methods across three different markets (IBOVESPA, NYSE, and cryptocurrencies) and comparing them against training on unnormalized data. The results indicate that, in this specific domain, state normalization can indeed degrade the agent's performance.
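The abstract does not specify the two normalization methods, but the information-loss argument can be illustrated with two schemes commonly used in RL portfolio work: dividing a lookback window of prices by the most recent close, and z-score standardization. The function names below are illustrative, not from the paper; the sketch only shows why both schemes erase absolute price levels. Both transforms are invariant under a positive rescaling of prices, so two assets with very different nominal values can produce identical states.

```python
import numpy as np

def price_relative_state(closes):
    """Normalize a price window by its latest close, so the agent
    only observes relative price movements (a common scheme in
    RL-based portfolio optimization)."""
    closes = np.asarray(closes, dtype=float)
    return closes / closes[-1]

def zscore_state(closes):
    """Z-score standardization of the price window: subtract the
    window mean and divide by the window standard deviation."""
    closes = np.asarray(closes, dtype=float)
    return (closes - closes.mean()) / closes.std()

# A $100 asset and a $1 asset with the same relative movements
# yield identical states under both schemes, so any information
# carried by the absolute price level is discarded.
expensive = [100.0, 110.0, 121.0]
cheap = [1.0, 1.1, 1.21]
print(np.allclose(price_relative_state(expensive), price_relative_state(cheap)))
print(np.allclose(zscore_state(expensive), zscore_state(cheap)))
```

An agent trained on unnormalized states, by contrast, would receive `[100.0, 110.0, 121.0]` and `[1.0, 1.1, 1.21]` as distinct observations, preserving the absolute economic magnitudes the paper argues matter outside cryptocurrency markets.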