🤖 AI Summary
Existing automated GUI testing approaches struggle to emulate the behavioral diversity exhibited by real-world crowdtesters. To address this limitation, this work proposes PersonaTester, a framework that, for the first time, integrates a three-dimensional personality model (testing mindset, exploration strategy, and interaction habits) into large language model (LLM)-driven GUI testing agents, enabling high-fidelity, controllable, and diverse automated crowdtesting. Experimental results demonstrate that PersonaTester significantly enhances inter-agent behavioral diversity (by 117.86%–126.23%) while preserving intra-agent consistency. The approach uncovered more than 100 crashes and 11 functional defects, substantially outperforming baseline methods in both defect-detection efficacy and behavioral realism.
📝 Abstract
The rapid proliferation and increasing complexity of software demand robust quality assurance, with graphical user interface (GUI) testing playing a pivotal role. Crowdsourced testing has proven effective in this context by leveraging the diversity of human testers to achieve rich, scenario-based coverage across varied devices, user behaviors, and usage environments. In parallel, automated testing, particularly with the advent of large language models (LLMs), offers significant advantages in controllability, reproducibility, and efficiency, enabling scalable and systematic exploration. However, automated approaches often lack the behavioral diversity characteristic of human testers, limiting their ability to fully simulate real-world testing dynamics. To address this gap, we present PersonaTester, a novel personified-LLM-based framework designed to automate crowdsourced GUI testing. By injecting representative personas, defined along three orthogonal dimensions (testing mindset, exploration strategy, and interaction habit), into LLM-based agents, PersonaTester enables the simulation of diverse human-like testing behaviors in a controllable and repeatable manner. Experimental results demonstrate that PersonaTester faithfully reproduces the behavioral patterns of real crowdworkers, exhibiting strong intra-persona consistency and clear inter-persona variability (a 117.86%–126.23% improvement over the baseline). Moreover, persona-guided testing agents consistently generate more effective test events and trigger more crashes (100+) and functional bugs (11) than the persona-free baseline, substantially advancing the realism and effectiveness of automated crowdsourced GUI testing.
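To make the persona-injection idea concrete, the sketch below renders a point in the three-dimensional persona space (testing mindset, exploration strategy, interaction habit) as a system prompt for an LLM testing agent. The `Persona` class, the field values, and the prompt wording are all illustrative assumptions, not the authors' actual template or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """One point in the three-dimensional persona space from the abstract."""
    mindset: str   # e.g. destructive probing vs. validation-oriented checking
    strategy: str  # e.g. breadth-first vs. depth-first GUI exploration
    habit: str     # e.g. rapid, impatient tapping vs. slow, careful reading


def persona_system_prompt(p: Persona) -> str:
    """Render a persona as a system prompt that steers the agent's actions.

    The dimension names follow the paper; the concrete wording is a
    hypothetical example of how such an injection could look.
    """
    return (
        "You are a crowdtester exploring a mobile app's GUI.\n"
        f"- Testing mindset: {p.mindset}\n"
        f"- Exploration strategy: {p.strategy}\n"
        f"- Interaction habits: {p.habit}\n"
        "At each step, choose one UI action consistent with this persona."
    )


prompt = persona_system_prompt(
    Persona(
        mindset="destructive: probe edge cases and invalid inputs",
        strategy="breadth-first: visit many screens before going deep",
        habit="rapid tapping with little time spent reading text",
    )
)
print(prompt)
```

Keeping the three dimensions orthogonal in a structured object, rather than in free-form prompt text, is one plausible way to obtain the controllable and repeatable behavior the abstract emphasizes: the same `Persona` always yields the same prompt.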