🤖 AI Summary
This study investigates the persistent use of Log4j 1.x, a logging framework that has been end-of-life since 2015, in contemporary open-source projects, challenging the common assumption that deprecated dependencies are promptly abandoned.
Method: Leveraging the MSR 2025 Challenge dataset, we conduct an empirical analysis of build logs and dependency graphs from over 10,000 open-source projects, integrating the Goblin framework, log parsing, time-series trend modeling, and project metadata mining.
Contribution/Results: We find that approximately 12% of projects initiated after 2022 adopt Log4j 1.x, which indicates systematic reintroduction rather than legacy carryover. This reveals a mechanism of technical debt propagation across project generations, termed "lifecycle mismatch", in which newly created projects inherit obsolete, unsupported dependencies. Our findings provide empirical evidence for security governance, dependency evolution modeling, and technical debt management, and offer a new analytical lens on dependency hygiene and ecosystem-wide risk accumulation.
📝 Abstract
Log4j has become a widely adopted logging library for Java programs owing to its long history and high reliability. Its widespread use reflects not only its maturity but also the complexity and depth of its features, which have made it an essential tool for many developers. However, Log4j 1.x, which has reached the end of its support and is now deprecated, poses significant security risks and retains numerous deprecated features that attackers can exploit. Despite this, some clients may still rely on the library. We aim to understand whether clients continue to use Log4j 1.x even though its official support has ended. Using the Goblin framework, we analyzed over 10,000 log entries from the Mining Software Repositories 2025 challenge dataset, which provides a large and representative sample of open-source software projects, to identify trends in usage rates for both Log4j 1.x and Log4j-core 2.x. Our study addressed two key issues: (1) we examined the usage rates and trends for these two libraries, highlighting notable differences and patterns in their adoption; and (2) we showed that a deprecated library can remain popular among projects initiated after it reached the end of its support lifecycle. These findings highlight that deprecated libraries remain popular; the next step is to understand the reasoning behind these adoptions.
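The usage-rate comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use the Goblin framework's API. The input shape (pairs of release year and dependency list), the function name `yearly_usage_rates`, and the sample records are all hypothetical, chosen only to show how a per-year share of releases depending on each library might be computed. The two Maven coordinates are the standard ones for Log4j 1.x and Log4j-core 2.x.

```python
from collections import defaultdict

# Maven coordinates (groupId:artifactId) of the two libraries being compared.
LOG4J_1 = "log4j:log4j"
LOG4J_2 = "org.apache.logging.log4j:log4j-core"
LIBRARIES = (LOG4J_1, LOG4J_2)


def yearly_usage_rates(releases):
    """Return {year: {library: share of that year's releases depending on it}}.

    `releases` is an iterable of (release_year, dependencies) pairs, where
    `dependencies` is a collection of "groupId:artifactId" strings.
    This input shape is an assumption, not the dataset's actual schema.
    """
    totals = defaultdict(int)
    hits = defaultdict(lambda: {lib: 0 for lib in LIBRARIES})
    for year, deps in releases:
        totals[year] += 1
        dep_set = set(deps)
        for lib in LIBRARIES:
            if lib in dep_set:
                hits[year][lib] += 1
    return {
        year: {lib: hits[year][lib] / totals[year] for lib in LIBRARIES}
        for year in sorted(totals)
    }


# Toy records, invented purely for illustration (not taken from the study).
sample = [
    (2014, ["log4j:log4j"]),
    (2014, ["junit:junit"]),
    (2023, ["org.apache.logging.log4j:log4j-core"]),
    (2023, ["log4j:log4j", "junit:junit"]),
    (2023, ["junit:junit"]),
    (2023, ["org.apache.logging.log4j:log4j-core"]),
]

rates = yearly_usage_rates(sample)
for year, shares in rates.items():
    print(year, shares)
```

Comparing the two shares year by year gives the kind of trend the study reports. Restricting the input to projects created after the end-of-support date would address the second question, under the same assumptions about the data.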