🤖 AI Summary
To address the low exploration efficiency and insufficient coverage in GUI testing of web applications, which stem from state-space explosion and complex interaction logic, this paper proposes WebRLED, an automated testing framework based on deep reinforcement learning. Its key contributions are: (1) a novel grid-based action-value learning mechanism that models the action space with explicit awareness of DOM tree structure; (2) an online-updatable action discriminator that dynamically refines feasibility predictions for candidate actions; and (3) a dual-scale curiosity-driven reward model integrating local novelty estimation with global execution history to mitigate sparse-reward challenges. Evaluated on 12 open-source web applications and, in a field study, on the 50 most popular real-world web applications, WebRLED achieves higher code coverage, state coverage, and failure detection rate than state-of-the-art methods. In the field study it identified 695 unique failures across the 50 real-world applications.
📝 Abstract
Automated GUI testing of web applications has long been considered challenging because of their large state space and complex interaction logic. Deep Reinforcement Learning (DRL) is a recent extension of Reinforcement Learning (RL) that takes advantage of the powerful learning capabilities of neural networks, making it suitable for complex exploration spaces. In this paper, leveraging the capability of deep reinforcement learning, we propose WebRLED, an effective approach for automated GUI testing of complex web applications. WebRLED has the following characteristics: (1) a grid-based action value learning technique, which can improve the efficiency of state space exploration; (2) a novel action discriminator, which can be trained during exploration to identify more actions; (3) an adaptive, curiosity-driven reward model, which considers the novelty of an explored state both within an episode and across the global history, and can guide exploration continuously. We conduct a comprehensive evaluation of WebRLED on 12 open-source web applications and a field study on the 50 most popular web applications in the world. The experimental results show that WebRLED achieves higher code/state coverage and a higher failure detection rate than existing state-of-the-art (SOTA) techniques. Furthermore, WebRLED finds 695 unique failures in the 50 real-world applications.
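The abstract does not say how the curiosity-driven reward is computed. As a rough illustration only, the sketch below assumes a simple count-based notion of novelty, tracked at two scales: within the current episode and across the whole exploration history. The class name `DualScaleCuriosity`, the inverse-square-root bonus, and the equal weights are all hypothetical choices made for this example and are not taken from WebRLED.

```python
from collections import Counter
from math import sqrt


class DualScaleCuriosity:
    """Count-based novelty reward at two scales (illustrative assumption,
    not the reward model described in the paper)."""

    def __init__(self, episode_weight: float = 0.5, global_weight: float = 0.5):
        self.episode_weight = episode_weight
        self.global_weight = global_weight
        self.episode_counts = Counter()  # visits within the current episode
        self.global_counts = Counter()   # visits across all episodes

    def start_episode(self) -> None:
        # Only the episode-level memory is cleared; global history persists.
        self.episode_counts.clear()

    def reward(self, state_id: str) -> float:
        # Record the visit, then pay a bonus that shrinks as the state
        # becomes more familiar at each scale.
        self.episode_counts[state_id] += 1
        self.global_counts[state_id] += 1
        episode_novelty = 1.0 / sqrt(self.episode_counts[state_id])
        global_novelty = 1.0 / sqrt(self.global_counts[state_id])
        return (self.episode_weight * episode_novelty
                + self.global_weight * global_novelty)


model = DualScaleCuriosity()
model.start_episode()
first_visit = model.reward("login_page")    # new at both scales
repeat_visit = model.reward("login_page")   # familiar at both scales
model.start_episode()
next_episode = model.reward("login_page")   # new in this episode, known globally
```

Under these assumptions, a state revisited in a later episode earns more than an immediate repeat but less than a first-ever visit, which is one way a reward could keep guiding exploration after the early novelty is exhausted.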