Evaluating Large Language Models in detecting Secrets in Android Apps

📅 2025-10-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Android applications frequently hardcode authentication credentials such as API keys, exposing them to extraction through reverse engineering and, in turn, to data breaches and API abuse. Existing detection techniques (regex-based, static analysis, and machine learning approaches) rely on predefined patterns or labeled training data, which limits their ability to identify novel secret types. This paper introduces SecretLoc, an LLM-based approach for detecting hardcoded secrets in Android apps. By using contextual and structural cues, SecretLoc identifies secret types that existing methods miss, without predefined patterns or labeled training sets. On a benchmark dataset from the literature, it finds 4,828 secrets that existing approaches did not detect, covering more than 10 new secret types. Applied to 5,000 newly crawled Google Play apps, it detects secrets in 2,124 apps (42.5%), several of which developers confirmed and remediated after disclosure. The results highlight a dual-use risk: the same LLM capabilities are available to attackers as well as defenders.

📝 Abstract

Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often hardcode these credentials into Android apps, exposing them to extraction through reverse engineering. Once secrets are compromised, adversaries can exploit them to access sensitive data, manipulate resources, or abuse APIs, resulting in significant security and financial risks. Existing detection approaches, such as regex-based analysis, static analysis, and machine learning, are effective for identifying known patterns but are fundamentally limited: they require prior knowledge of credential structures, API signatures, or training data. In this paper, we propose SecretLoc, an LLM-based approach for detecting hardcoded secrets in Android apps. SecretLoc goes beyond pattern matching; it leverages contextual and structural cues to identify secrets without relying on predefined patterns or labeled training sets. Using a benchmark dataset from the literature, we demonstrate that SecretLoc detects secrets missed by regex-, static-, and ML-based methods, including previously unseen types of secrets. In total, we discovered 4,828 secrets that were undetected by existing approaches, spanning more than 10 "new" types of secrets, such as OpenAI API keys, GitHub access tokens, RSA private keys, and JSON Web Tokens (JWTs). We further extend our analysis to newly crawled apps from Google Play, where we uncovered and responsibly disclosed additional hardcoded secrets. Across a set of 5,000 apps, we detected secrets in 2,124 apps (42.5%), several of which were confirmed and remediated by developers after we contacted them. Our results reveal a dual-use risk: if analysts can uncover these secrets with LLMs, so can attackers. This underscores the urgent need for proactive secret management and stronger mitigation practices across the mobile ecosystem.

Problem

Research questions and friction points this paper is trying to address.

Detecting hardcoded authentication secrets in Android mobile applications

Overcoming limitations of pattern-based detection methods for credentials

Identifying previously unknown types of secrets through contextual analysis
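
The limitation of pattern-based detection named above can be illustrated with a minimal sketch. The two regexes are commonly cited key formats chosen for illustration only; they are assumptions and not the rule set of any baseline evaluated in the paper. The function name `regex_scan` and the sample strings are likewise hypothetical.

```python
import re

# Illustrative, commonly cited key formats (assumptions, not the paper's rules).
KNOWN_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
}


def regex_scan(text):
    """Return (pattern_name, matched_value) pairs for every known pattern hit."""
    hits = []
    for name, pattern in KNOWN_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


# A fake key in a known format is found...
known = 'String k = "AKIAABCDEFGHIJKLMNOP";'
# ...but a made-up secret in an unlisted format produces no hit, even though
# the variable name strongly suggests it is a credential.
unknown = 'String backendSecret = "q7Zp-internal-9f3c2b1a";'

print(regex_scan(known))
print(regex_scan(unknown))
```

The second string is the kind of case where the surrounding name carries the signal, which a fixed pattern list cannot use.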

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs to detect hardcoded secrets in Android apps

Leverages contextual and structural cues without predefined patterns or labeled training sets

Identifies previously unseen secret types missed by existing methods
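
As a rough illustration of what "contextual cues" could mean in practice, the sketch below pairs each string resource's name with its value and formats them into a prompt for a language model. This is not SecretLoc's actual pipeline or prompt, which this summary does not describe; the sample XML, the function names `extract_candidates` and `build_prompt`, and the prompt wording are all hypothetical, and the model call itself is deliberately omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical strings.xml content from a decompiled app (fabricated values).
STRINGS_XML = """<resources>
    <string name="app_name">Weather Now</string>
    <string name="payment_gateway_secret">q7Zp-internal-9f3c2b1a</string>
</resources>"""


def extract_candidates(strings_xml):
    """Return (resource_name, value) pairs for every <string> element."""
    root = ET.fromstring(strings_xml)
    return [(el.get("name"), (el.text or "").strip()) for el in root.iter("string")]


def build_prompt(candidates):
    """Format name/value pairs so a model sees the name as context for the value."""
    lines = [
        "You are reviewing string resources from a decompiled Android app.",
        "For each entry, say whether the value looks like a hardcoded secret",
        "and give a one-line reason based on both the name and the value.",
        "",
    ]
    for name, value in candidates:
        lines.append(f"- name={name!r} value={value!r}")
    return "\n".join(lines)


candidates = extract_candidates(STRINGS_XML)
prompt = build_prompt(candidates)
print(prompt)
```

Keeping the resource name next to the value is the design choice that matters here: the value alone matches no known format, while the name hints at its purpose.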
