Incentive-Aligned Vehicle-to-Vehicle Energy Trading via Nash-Integrated Multi-Agent Reinforcement Learning

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses decentralized vehicle-to-vehicle (V2V) energy trading among self-interested electric vehicles with heterogeneous charging demands and uncertain arrival and departure times. The authors propose Nash-MADDPG, which integrates the Nash bargaining solution into a multi-agent deep deterministic policy gradient framework. Nash bargaining sets dynamic bilateral prices, and a bargaining-oriented price-proximity reward steers agents toward incentive-aligned trading strategies without central coordination. In a 30-day simulation, the method achieves 61.6% higher social welfare, 62.9% greater trading volume, and a 40.1% improvement in Jain's fairness index relative to a Double Auction baseline. Tests with populations of 6 to 100 agents indicate that the approach scales with population size and that prices remain empirically stable near the Nash bargaining benchmark.
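To make the two mechanisms named in the summary concrete, the following is a minimal sketch of a Nash bargaining price and a price-proximity reward for one buyer-seller pair. It is not the paper's formulation: the linear utilities, the `seller_power` weight, the normalizing `scale`, and both function names are assumptions chosen for illustration. Under those assumptions, maximizing the Nash product (p - c)^β (v - p)^(1 - β) gives the closed form p = c + β(v - c).

```python
def nash_bargaining_price(seller_cost, buyer_value, seller_power=0.5):
    """Price per kWh that maximizes the (generalized) Nash product.

    Assumes linear utilities: the seller gains (p - seller_cost) and the
    buyer gains (buyer_value - p) per kWh, with disagreement payoffs of 0.
    Returns None when no mutually beneficial price exists.
    """
    if not 0.0 <= seller_power <= 1.0:
        raise ValueError("seller_power must lie in [0, 1]")
    if buyer_value <= seller_cost:
        return None  # no gains from trade, so no bargain
    return seller_cost + seller_power * (buyer_value - seller_cost)


def price_proximity_reward(proposed_price, nash_price, scale):
    """Hypothetical shaping term: 0 at the Nash price, negative elsewhere.

    The penalty grows linearly with the distance from the Nash price,
    normalized by `scale` (for example, the width of the bargaining range).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return -abs(proposed_price - nash_price) / scale


if __name__ == "__main__":
    cost, value = 0.10, 0.30  # illustrative prices per kWh
    p_star = nash_bargaining_price(cost, value)
    print("Nash price:", p_star)
    print("Reward at 0.25:", price_proximity_reward(0.25, p_star, value - cost))
```

With equal bargaining power the price splits the surplus evenly between the two parties. An agent's learning signal would then combine its trading payoff with the proximity term, although the weighting between the two is not stated in the source.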

📝 Abstract

Vehicle-to-vehicle (V2V) energy trading enables decentralized peer-to-peer energy exchange among electric vehicles (EVs), reducing grid dependency while monetizing surplus capacity. However, coordinating self-interested EV agents with diverse charging needs and uncertain arrival-departure schedules remains challenging. Existing approaches either require centralized optimization with computational limitations or lack fairness guarantees. This paper integrates the Nash Bargaining Solution into Multi-Agent Deep Deterministic Policy Gradient, yielding Nash-MADDPG, for incentive-aligned V2V energy trading. Nash bargaining determines efficient bilateral pricing, while Nash-guided price-proximity rewards align agent learning toward bargaining-optimal strategies. Evaluation over 30 days of continuous operation demonstrates a 61.6% improvement in social welfare and a 62.9% improvement in trading volume over Double Auction, while achieving superior fairness, including a 40.1% improvement in Jain's index. Testing with 6 to 100 agents over the same 30-day horizon with continuous vehicle turnover confirms scalability across population sizes and empirically stable pricing near the Nash Bargaining benchmark.
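The fairness figure in the abstract refers to Jain's index, which for n non-negative allocations x_1, ..., x_n is (Σx)² / (n · Σx²) and ranges from 1/n (one agent receives everything) to 1 (all agents receive the same). The sketch below computes it for a list of per-agent benefits. The function name and the handling of the all-zero case are assumptions, and the abstract does not say which per-agent quantity the paper feeds into the index.

```python
def jain_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Expects non-negative values. An all-zero allocation is treated as
    perfectly equal and returns 1.0 (a convention chosen for this sketch).
    """
    values = list(allocations)
    if not values:
        raise ValueError("allocations must not be empty")
    if any(v < 0 for v in values):
        raise ValueError("allocations must be non-negative")
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    if sum_of_squares == 0:
        return 1.0
    return (total * total) / (len(values) * sum_of_squares)


if __name__ == "__main__":
    print(jain_index([5.0, 5.0, 5.0, 5.0]))  # equal shares
    print(jain_index([20.0, 0.0, 0.0, 0.0]))  # one agent takes all
```

Because the index is scale-invariant, it compares how evenly benefits are spread across agents rather than how large they are, which is why it is reported alongside social welfare and trading volume.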

Problem

Research questions and friction points this paper is trying to address.

vehicle-to-vehicle energy trading

multi-agent coordination

incentive alignment

fairness

decentralized energy exchange

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nash Bargaining

Multi-Agent Reinforcement Learning

Vehicle-to-Vehicle Energy Trading