🤖 AI Summary
This study addresses the widespread and critical security risk of API credential leakage in web frontends, particularly within JavaScript code. In the first large-scale, systematic analysis of its kind, covering 10 million webpages, the authors combine static content analysis, credential pattern matching, and historical archive tracing to identify 1,748 unique credentials from 14 service providers across nearly 10,000 webpages, including those of global banks and firmware developers. The research uncovers frontend-specific leakage vectors, propagation pathways, and persistent exposure issues, with some credentials remaining publicly accessible for several years. The authors' responsible disclosure efforts led to a substantial reduction in the number of exposed credentials and to real-world remediation by affected organizations.
📝 Abstract
Application programming interfaces (APIs) have become a central part of the modern IT environment, allowing developers to enrich the functionality of applications and interact with third parties such as cloud and payment providers. This interaction often relies on authentication mechanisms that use sensitive credentials, such as API keys and tokens, which require secure handling. Exposure of these credentials can have serious consequences for organizations, as attackers can gain access to the related services. Previous studies have shown that such credentials are exposed in environments such as cloud platforms and GitHub. However, exposure on the web remains unexplored.
In this paper, we study the exposure of credentials on the web by analyzing 10M webpages. Our findings reveal that API credentials are widely and publicly exposed on the web, including on highly popular and critical webpages such as those of global banks and firmware developers. We identify 1,748 distinct credentials from 14 service providers (e.g., cloud and payment providers) across nearly 10,000 webpages. Moreover, our analysis of archived data suggests that credentials remain exposed for periods ranging from a month to several years. We characterize web-specific exposure vectors and root causes, finding that most originate from JavaScript environments. We also discuss the outcomes of our responsible disclosure efforts, which led to a substantial reduction in credential exposure on the web.
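To make the idea of detecting credentials in JavaScript content concrete, the following is a minimal sketch of regex-based credential matching. It is not the authors' pipeline: the two patterns (an AWS access key ID prefix and a Stripe live secret key prefix) are commonly cited formats chosen here as assumptions, and the names `CREDENTIAL_PATTERNS` and `find_credentials` are hypothetical.

```python
import re

# Illustrative patterns only (assumptions, not the paper's rule set).
# Real detectors typically use many more patterns plus validation steps.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),
}


def find_credentials(js_source: str) -> list[tuple[str, str]]:
    """Return (pattern_label, matched_string) pairs found in js_source."""
    hits = []
    for label, pattern in CREDENTIAL_PATTERNS.items():
        for match in pattern.finditer(js_source):
            hits.append((label, match.group(0)))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    sample = 'const cfg = { accessKeyId: "AKIAIOSFODNN7EXAMPLE" };'
    print(find_credentials(sample))
```

A pattern match alone only indicates a candidate; deciding whether a string is a real, active credential would need further checks that this sketch does not attempt.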