Breaking Android with AI: A Deep Dive into LLM-Powered Exploitation

📅 2025-09-09
🤖 AI Summary
This study investigates the feasibility and limitations of leveraging large language models (LLMs) for automated Android root privilege escalation—a critical task in penetration testing—addressing gaps in efficacy, reliability, and scalability. Method: We design an integrated automation framework combining PentestGPT, OpenAI’s API, and a custom web platform, deployed within Genymotion emulators to enable end-to-end vulnerability identification, AI-generated exploit script generation, and response analysis. Contribution/Results: (1) Empirical evaluation confirms LLMs significantly accelerate privilege escalation but require human validation for accuracy and compliance; (2) We identify key capability boundaries and high false-positive rates in mobile security contexts; (3) We propose a novel security and ethics framework for AI-augmented penetration testing. Overall, LLMs demonstrate practical utility in automating routine tasks but remain insufficient to replace expert human analysis in complex, high-stakes security assessments.

📝 Abstract
The rapid evolution of Artificial Intelligence (AI) and Large Language Models (LLMs) has opened new opportunities in cybersecurity, particularly in exploitation automation and penetration testing. This study explores the automation of Android penetration testing with LLM-based tools, especially PentestGPT, to identify and execute rooting techniques. By comparing the traditional manual rooting process with AI-generated exploitation methods, it evaluates the efficacy, reliability, and scalability of automated penetration testing in achieving high-privilege access on Android devices. Using an Android emulator (Genymotion) as the testbed, we execute both traditional and exploit-based rooting methods end to end, automating the process with AI-generated scripts. We also build a web application that integrates OpenAI's API to generate scripts automatically from LLM-processed responses. The research assesses the effectiveness of AI-enabled exploitation by comparing automated and manual penetration-testing protocols, identifying LLM strengths and weaknesses along the way. We further offer security recommendations for AI-enabled exploitation, including ethical considerations and potential misuse. The findings show that while LLMs can significantly streamline exploitation workflows, human oversight remains necessary to ensure accuracy and ethical application. This study adds to the growing body of literature on AI-powered cybersecurity and its effect on ethical hacking, security research, and mobile device security.
Problem

Research questions and friction points this paper is trying to address.

Automating Android penetration testing using LLM tools
Evaluating the effectiveness of AI-generated versus manual exploitation methods
Assessing ethical implications of AI-powered cybersecurity exploitation
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-powered Android penetration testing automation
AI-generated scripts for rooting techniques
Web app integrating OpenAI API automation
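Both the manual and the AI-generated rooting workflows end at the same checkpoint: did the attempt yield uid 0 on the emulator? A minimal sketch of that verification step follows; the helper names are assumptions, and the check relies on the standard output format of `adb shell id`.

```python
import re
import subprocess


def is_rooted(id_output: str) -> bool:
    """Return True if the output of `adb shell id` reports uid 0 (root)."""
    match = re.match(r"uid=(\d+)", id_output.strip())
    return match is not None and match.group(1) == "0"


def check_emulator_root() -> bool:
    """Run the check against a connected emulator (requires adb on PATH)."""
    out = subprocess.run(
        ["adb", "shell", "id"], capture_output=True, text=True, check=True
    ).stdout
    return is_rooted(out)
```

Separating the parse (`is_rooted`) from the device call keeps the success criterion testable offline, with the same check reused whether the rooting script came from a human or an LLM.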
👥 Authors
Wanni Vidulige Ishan Perera (Sam Houston State University, USA)
Xing Liu (Sam Houston State University, USA)
Fan Liang (Sam Houston State University, USA)