🤖 AI Summary
Existing learning-based ESG investment approaches rely on static scores, which are noisy, provider-dependent, updated infrequently, and temporally misaligned with sequential portfolio decisions. This work treats ESG as a dynamic constraint rather than a policy input: a Multimodal Action-Conditioned Constraint Field (MACF) learns mechanism-specific ESG costs from point-in-time multimodal evidence and contemplated portfolio transitions. MACF-X, a family of optimizer-specific adapters, then converts these costs and their uncertainties into native constrained-optimization interfaces through a shared slack- and uncertainty-aware pressure layer. Without modifying the financial policy's observation or reward, the method reduces tail ESG budget pressure while maintaining competitive financial performance. Ablations show that the improvement depends on dynamic evidence inputs and the three-head decomposition, while static ESG-score proxies are nearly indistinguishable from score-shuffled noise baselines.
📝 Abstract
ESG-aware portfolio optimization is increasingly important for sustainable capital allocation, yet most learning-based methods still operationalize ESG by appending static scores to the policy observation or reward. This creates a mismatch for sequential control: ESG scores are noisy, provider-dependent, low-frequency, and temporally misaligned with sequential portfolio decisions, while financial evidence suggests that ESG is better treated as a portfolio preference, risk-exposure, or hedge dimension than as a robust alpha factor. We propose to impose ESG constraints without modifying the financial policy's observation or reward, using a Multimodal Action-Conditioned Constraint Field (MACF) that learns mechanism-specific ESG costs from point-in-time multimodal evidence and contemplated portfolio transitions. We then introduce MACF-X, a family of optimizer-specific adapters that converts MACF costs and uncertainties into native constrained-optimization interfaces through a shared slack- and uncertainty-aware pressure layer. Across multiple constraint-integration interfaces, MACF-X reduces tail ESG budget pressure while maintaining competitive financial performance. Ablations show that this improvement depends on dynamic evidence inputs and three-head decomposition, while static ESG-score proxies are nearly indistinguishable from score-shuffled noise baselines.
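The abstract does not spell out how the pressure layer is computed. As a rough illustration only, the sketch below shows one generic way a cost estimate and its uncertainty could be turned into a constraint signal for a Lagrangian-style optimizer. The function names, the `kappa` uncertainty weight, the budget value, and the dual-ascent update are all assumptions made for this example and are not taken from the paper.

```python
def signed_violation(cost_mean, cost_std, budget, kappa=1.0):
    """Hypothetical pressure signal (an assumption, not the paper's layer).

    A pessimistic cost estimate (mean plus kappa standard deviations) is
    compared with the ESG budget. Positive values mean the budget is
    exceeded; negative values mean there is slack left.
    """
    pessimistic_cost = cost_mean + kappa * cost_std
    return pessimistic_cost - budget


def dual_update(multiplier, violation, lr=0.1):
    """One projected dual-ascent step on a Lagrange multiplier.

    The multiplier grows while the constraint is violated, shrinks while
    there is slack, and is clipped so it never becomes negative.
    """
    return max(0.0, multiplier + lr * violation)


if __name__ == "__main__":
    # Illustrative numbers only: estimated ESG cost 0.5, uncertainty 0.1,
    # budget 0.55, so the pessimistic estimate exceeds the budget.
    v = signed_violation(cost_mean=0.5, cost_std=0.1, budget=0.55)
    lam = dual_update(multiplier=0.0, violation=v)
    print(f"violation={v:.3f}, multiplier={lam:.4f}")
```

In this sketch the financial policy's observation and reward are never touched: the ESG signal enters only through the multiplier that the constrained optimizer would apply to the cost term.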