Deep Reinforcement Learning-based Cell DTX/DRX Configuration for Network Energy Saving

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the energy-delay trade-off in cell-level DTX/DRX configuration under 3GPP Release 18, this paper proposes a dynamic optimization framework based on deep reinforcement learning (DRL). We introduce a contextual bandit (CB)-enhanced deep Q-network architecture and design a smooth, differentiable reward function that approximates the theoretically optimal—but discontinuous—joint QoS-energy-efficiency objective. The method adaptively adjusts DTX/DRX patterns in response to varying traffic loads and channel conditions. System-level simulations demonstrate up to 45% reduction in base station energy consumption while strictly limiting QoS degradation for end-to-end latency-sensitive services to within 1%, outperforming both static configurations and conventional RL baselines.
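The contextual-bandit formulation means each DTX/DRX decision is scored by its immediate reward alone, with no bootstrapped next-state term as in a full DQN. A minimal sketch of that update rule, using a linear Q-function and hypothetical context/action dimensions in place of the paper's deep network:

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 4   # hypothetical set of candidate cell DTX/DRX configurations
CTX_DIM = 3     # hypothetical context features, e.g. load, channel, delay budget

# Linear Q-function standing in for the paper's deep Q-network.
W = np.zeros((N_ACTIONS, CTX_DIM))

def q_values(ctx):
    return W @ ctx

def select_action(ctx, eps=0.1):
    """Epsilon-greedy choice over per-configuration Q-values."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(ctx)))

def update(ctx, action, reward, lr=0.05):
    """Contextual-bandit update: the target is the immediate reward,
    with no discounted next-state maximum as in standard DQN."""
    td_error = reward - q_values(ctx)[action]
    W[action] += lr * td_error * ctx

# Toy demo: for a fixed context, configuration 2 yields the best reward.
ctx = np.array([1.0, 0.5, -0.2])
for _ in range(2000):
    a = select_action(ctx, eps=0.3)
    update(ctx, a, 1.0 if a == 2 else 0.0)
best = int(np.argmax(q_values(ctx)))
```

After training, `best` is the configuration with the highest estimated reward for this context; in the paper, the context would instead encode the observed network and traffic conditions.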

📝 Abstract
3GPP Release 18 cell discontinuous transmission and reception (cell DTX/DRX) is an important new network energy saving feature for 5G. As a time-domain technique, it periodically aggregates user data transmissions into a given portion of each cycle when the traffic load is light, so that the remaining time can be kept silent and advanced sleep modes (ASM) can shut down more radio components to save more energy for the cell. Inevitably, however, packet delay increases, since no transmission is allowed during the silent period. In this paper we study how to configure cell DTX/DRX to optimally balance energy saving against packet delay, so that for delay-sensitive traffic maximum energy saving can be achieved while the degradation of quality of service (QoS) is minimized. Because the optimal configuration differs across network and traffic conditions, the problem is complex, and we resort to a deep reinforcement learning (DRL) framework to train an AI agent to solve it. Through careful design of 1) the learning algorithm, which implements a deep Q-network (DQN) on a contextual bandit (CB) model, and 2) the reward function, which uses a smooth approximation of a theoretically optimal but discontinuous reward function, we train an AI agent that always tries to select the best possible cell DTX/DRX configuration under any network and traffic conditions. Simulation results show that, compared to operation without cell DTX/DRX, our agent achieves up to ~45% energy saving depending on the traffic load scenario, while always maintaining no more than ~1% QoS degradation.
Problem

Research questions and friction points this paper is trying to address.

Balancing energy saving and packet delay in 5G cell DTX/DRX configuration
Optimizing cell DTX/DRX for varying network and traffic conditions
Minimizing QoS degradation while maximizing energy efficiency
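The trade-off behind these points can be made concrete with a toy duty-cycle model (my illustration, not the paper's system model): the larger the silent fraction of a DTX cycle, the lower the energy draw, but the longer a packet arriving during the silence may wait.

```python
def dtx_tradeoff(cycle_ms, active_ratio, p_active=1.0, p_sleep=0.2):
    """Toy cell DTX energy-delay model with illustrative power levels.

    The cell transmits for active_ratio of each cycle and sleeps
    (advanced sleep mode) for the rest.  Energy falls with the active
    ratio, while a packet arriving at the start of the silent period
    waits for its whole duration in the worst case.
    """
    energy = active_ratio * p_active + (1.0 - active_ratio) * p_sleep
    worst_case_delay_ms = (1.0 - active_ratio) * cycle_ms
    return energy, worst_case_delay_ms
```

Shrinking the active ratio from 0.75 to 0.25 in a 40 ms cycle halves the energy in this toy model but triples the worst-case added delay, which is exactly the tension the paper's agent must resolve per traffic condition.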
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep reinforcement learning framework for balancing energy saving and packet delay
Contextual bandit (CB) model combined with a deep Q-network (DQN)
Smooth, differentiable approximation of the discontinuous optimal reward function
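The paper's exact reward function is not reproduced on this page. One way to realize a smooth stand-in for a discontinuous "reward energy saving only while QoS holds" objective is a sigmoid gate; the function name, QoS limit, and sharpness below are illustrative assumptions, not the paper's formula.

```python
import math

def smooth_reward(energy_saving, delay_violation_rate,
                  qos_limit=0.01, sharpness=500.0):
    """Hypothetical smooth surrogate for a discontinuous reward.

    The idealized objective rewards energy saving only when QoS
    degradation stays within the limit (here, <= 1% delay violations),
    making it a step function of the violation rate.  A steep sigmoid
    gate gives a differentiable approximation of that step.
    """
    # gate ~ 1 below the QoS limit, ~ 0 above it, 0.5 exactly at it
    gate = 1.0 / (1.0 + math.exp(sharpness * (delay_violation_rate - qos_limit)))
    return energy_saving * gate
```

Raising `sharpness` moves the surrogate closer to the true step while keeping gradients finite, which is the usual motivation for smoothing a discontinuous objective before gradient-based training.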