Optimizing Reinforcement Learning Training over Digital Twin Enabled Multi-fidelity Networks

📅 2026-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses antenna tilt optimization in wireless networks with mobile users, where the base station struggles to accurately track channel conditions and user mobility, incurring high data collection overhead and latency. To tackle this, the authors propose a digital-twin-assisted, multi-fidelity hierarchical reinforcement learning framework that jointly optimizes the antenna tilt adjustment policy and the ratio of training data sampled from the physical network versus its digital twin. The approach integrates a robust adversarial loss into Proximal Policy Optimization (PPO) to co-optimize the control policy and data acquisition efficiency. Experimental results show that the proposed method reduces physical-network data collection latency by up to 28.01% compared to baseline approaches while significantly improving aggregate user throughput.
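
To make the two-level structure concrete, below is a minimal Python sketch of the loop the summary describes: a first-level agent picks the physical/twin sampling ratio under a delay budget, and a second-level agent would update the tilt policy on the resulting mixed-fidelity batch. Everything in it (the sampling functions, delay constants, and the uniform draw standing in for the first-level robust-RL agent) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the two-level training loop.
# All names and numbers here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sample_physical(n):
    # High-fidelity observations from the physical network: accurate,
    # but each sample costs communication delay.
    return rng.normal(0.0, 0.1, size=(n, 4))

def sample_twin(n):
    # Low-fidelity observations from the digital twin: free to collect,
    # but noisier, since the twin only approximates channels and mobility.
    return rng.normal(0.0, 0.5, size=(n, 4))

BATCH = 64
PER_SAMPLE_DELAY = 0.02   # seconds per physical-network sample (assumed)
MAX_PHYS = 40             # largest physical batch the delay budget allows (assumed)

for episode in range(3):
    # Level 1: choose the data collection ratio rho. The paper trains a
    # robust-RL agent for this step; a uniform draw stands in for it here.
    rho = rng.uniform()
    n_phys = min(int(rho * BATCH), MAX_PHYS)
    batch = np.vstack([sample_physical(n_phys), sample_twin(BATCH - n_phys)])

    # Level 2: update the tilt-angle policy on the mixed-fidelity batch.
    # A real implementation would take a PPO step here; we only log stats.
    print(f"episode {episode}: rho={rho:.2f}, physical={n_phys}/{BATCH}, "
          f"delay={n_phys * PER_SAMPLE_DELAY:.2f}s, batch_std={batch.std():.3f}")
```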

📝 Abstract
In this paper, we investigate a novel digital network twin (DNT) assisted deep learning (DL) model training framework. In particular, we consider a physical network where a base station (BS) uses several antennas to serve multiple mobile users, and a DNT that is a virtual representation of the physical network. The BS must adjust its antenna tilt angles to optimize the data rates of all users. Due to user mobility, the BS may not be able to accurately track network dynamics such as wireless channels and user mobility. Hence, a reinforcement learning (RL) approach is used to dynamically adjust the antenna tilt angles. To train the RL model, we can use data collected from the physical network and from the DNT. Data collected from the physical network is more accurate but incurs more communication overhead than data collected from the DNT. Therefore, it is necessary to determine the ratio of data collected from the physical network to data collected from the DNT so as to improve the training of the RL model. We formulate this as an optimization problem whose goal is to jointly optimize the tilt angle adjustment policy and the data collection strategy, aiming to maximize the data rates of all users while constraining the time delay introduced by collecting data from the physical network. To solve this problem, we propose a hierarchical RL framework that integrates a robust adversarial loss with proximal policy optimization (PPO). Simulation results show that our proposed method reduces the physical network data collection delay by up to 28.01% and 1x, respectively, compared to a hierarchical RL that uses vanilla PPO as the first-level RL and a baseline that uses robust RL at the first level while selecting the data collection ratio randomly.
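
For concreteness, the constrained problem the abstract describes can be written roughly as below. The notation (the policy π_θ, ratio ρ_t, rates R_u, and delay function D) is our own illustrative reconstruction, not necessarily the paper's.

```latex
% A hedged reconstruction of the joint problem described in the abstract;
% the notation below is ours, not necessarily the paper's.
% \pi_\theta : tilt-adjustment policy,  \rho_t : data collection ratio,
% R_u : data rate of user u,  D(\rho_t) : physical-network collection delay.
\begin{aligned}
\max_{\pi_\theta,\;\{\rho_t\}} \quad
  & \mathbb{E}\!\left[\sum_{t=1}^{T}\sum_{u=1}^{U} R_u(\phi_t, h_{u,t})\right]
  && \text{maximize the sum rate of all users}\\
\text{s.t.} \quad
  & \phi_t \sim \pi_\theta(\cdot \mid s_t)
  && \text{tilt angles set by the RL policy}\\
  & \sum_{t=1}^{T} D(\rho_t) \le D_{\max}
  && \text{delay budget for physical-network data}\\
  & \rho_t \in [0, 1]
  && \text{fraction of samples from the physical network}
\end{aligned}
```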
Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning
Digital Twin
Multi-fidelity Networks
Data Collection Strategy
Antenna Tilt Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Digital Twin
Multi-fidelity Reinforcement Learning
Hierarchical RL
Proximal Policy Optimization
Robust Adversarial Learning
Hanzhi Yu
Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33146, USA
Hasan Farooq
Ericsson Research, Santa Clara, CA 95054, USA
Julien Forgeat
Ericsson Research, Santa Clara, CA 95054, USA
Shruti Bothe
Ericsson Research, Santa Clara, CA 95054, USA
Kristijonas Cyras
Ericsson Research, Santa Clara, CA 95054, USA
Md Moin Uddin Chowdhury
Ericsson Research, Santa Clara, CA 95054, USA
Mingzhe Chen
Assistant Professor, Electrical and Computer Engineering Department, University of Miami
Machine learning, digital network twins, unmanned aerial vehicles, semantic communications.