LELANTE: LEveraging LLM for Automated ANdroid TEsting

📅 2025-04-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Manual scripting of Appium/Espresso test cases for Android applications incurs high maintenance overhead and poor adaptability to UI evolution. Method: This paper proposes an end-to-end, natural language–driven automated testing framework that eliminates manual coding by directly parsing natural language test specifications and executing actions via GUI screen understanding and device interaction. Contributions/Results: (1) A novel screen refinement mechanism improves large language models' (LLMs) visual perception accuracy on mobile interfaces; (2) A structured, chain-of-thought prompting framework enhances logical consistency in action planning; (3) A lightweight LLM distillation strategy reduces deployment cost. Evaluated on 390 real-world test cases across 10 mainstream Android apps, the approach achieves a 73% execution success rate, demonstrating significant improvements in test scalability and robustness against UI changes.

📝 Abstract
Given a natural language test case description for an Android application, existing testing approaches require developers to manually write scripts using tools such as Appium and Espresso to execute the corresponding test case. This process is labor-intensive and demands significant effort to maintain as UI interfaces evolve throughout development. In this work, we introduce LELANTE, a novel framework that utilizes large language models (LLMs) to automate test case execution without requiring pre-written scripts. LELANTE interprets natural language test case descriptions, iteratively generates action plans, and performs the actions directly on the Android screen through its GUI. LELANTE employs a screen refinement process to enhance LLM interpretability, constructs a structured prompt for LLMs, and implements an action generation mechanism based on chain-of-thought reasoning of LLMs. To further reduce computational cost and enhance scalability, LELANTE utilizes model distillation from a foundational LLM. In experiments across 390 test cases spanning 10 popular Android applications, LELANTE achieved a 73% test execution success rate. Our results demonstrate that LLMs can effectively bridge the gap between natural language test case descriptions and automated execution, making mobile testing more scalable and adaptable.
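The abstract's perceive–refine–plan–act loop can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: `Device`, `refine_screen`, `llm_plan_action`, and `execute_test` are all stand-in names, and the LLM call is replaced by a placeholder.

```python
# Hypothetical sketch of an iterative LLM-driven test execution loop in the
# spirit of LELANTE (not the paper's actual code). All names are assumptions.
from dataclasses import dataclass, field

@dataclass
class Device:
    """Toy stand-in for an Android device; records the actions performed on it."""
    screen: str = "login_screen"
    actions: list = field(default_factory=list)

def refine_screen(raw_screen: str) -> str:
    # Screen refinement step: reduce the raw UI state to a cleaner
    # description before handing it to the LLM.
    return raw_screen.replace("_", " ")

def llm_plan_action(test_step: str, screen: str) -> str:
    # Placeholder for the chain-of-thought LLM call that maps one
    # natural-language step plus the refined screen to a concrete GUI action.
    return f"tap:{test_step}"

def execute_test(device: Device, steps: list) -> list:
    """Iteratively refine the screen, plan one action per step, and perform it."""
    for step in steps:
        screen = refine_screen(device.screen)
        action = llm_plan_action(step, screen)
        device.actions.append(action)  # "performing" the action on the toy device
    return device.actions
```

For example, running `execute_test(Device(), ["enter username", "press login"])` yields one planned GUI action per natural-language step, with no pre-written Appium/Espresso script involved.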
Problem

Research questions and friction points this paper is trying to address.

Automating Android test execution without manual scripting
Reducing maintenance effort for evolving UI interfaces
Bridging natural language descriptions to automated test actions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs to automate Android test execution
Employs screen refinement for better LLM interpretability
Applies model distillation to reduce computational cost
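The distillation bullet can be illustrated with the standard soft-label objective: a student model is trained to match the temperature-softened output distribution of a larger teacher. This is a generic sketch of knowledge distillation, an assumption about the technique, not the paper's specific recipe; `softmax` and `distillation_loss` are illustrative helpers.

```python
# Minimal sketch of the soft-label knowledge-distillation loss (generic
# technique, not LELANTE's exact training setup). Pure-Python for clarity.
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's soft predictions against the
    teacher's soft targets; minimized when the distributions match."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
```

A higher temperature flattens the teacher's distribution, exposing relative preferences among wrong answers; that extra signal is what lets a small student approximate a foundational LLM at lower deployment cost.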
Shamit Fatin
Bangladesh University of Engineering and Technology, Bangladesh
Mehbubul Hasan Al-Quvi
Bangladesh University of Engineering and Technology, Bangladesh
Sukarna Barua
Associate Professor, Computer Science & Engineering, BUET
Data Mining, Machine Learning, Pattern Recognition
Anindya Iqbal
Professor, Dept of CSE, BUET
Software Engineering, Applied Machine Learning, Security and Privacy, Participatory Sensing, Wireless Sensor Networks
Sadia Sharmin
Bangladesh University of Engineering and Technology, Bangladesh
Md. Mostofa Akbar
Bangladesh University of Engineering and Technology, Bangladesh
H. S. Shahgir
University of California, Riverside, USA
Kallol Kumar Pal
Samsung R&D Institute Bangladesh, Bangladesh
A. A. A. Rashid
Samsung R&D Institute Bangladesh, Bangladesh