Analyzing the Adoption of Database Management Systems Throughout the History of Open Source Projects

📅 2026-05-07

🤖 AI Summary
This study addresses the lack of systematic empirical analysis of how database management systems (DBMSs) are adopted and evolve in open-source projects. It examines the code history of 362 popular GitHub Java repositories, combining source-code heuristics, DB-Engines rankings, ORM detection, and version tracking to trace long-term DBMS usage. MySQL and PostgreSQL are the most prevalent relational DBMSs, while Redis and MongoDB are the most frequently used non-relational systems and tend to remain in place once adopted. HyperSQL, by contrast, is more often replaced as projects evolve. Polyglot persistence, in which a project uses several DBMSs side by side, often across relational and non-relational types, is widespread, and the propensity to be replaced differs substantially from one DBMS to another.
📝 Abstract
Database Management Systems (DBMSs) are widely used to store, retrieve, and manage the data handled by modern applications. Although prior work has studied the co-evolution of DBMSs and application source code, less is known about DBMS adoption, co-use, and replacement in real systems. This paper presents a historical study of DBMS usage in 362 popular open-source Java projects hosted on GitHub. We investigated the adoption of the top DBMSs ranked by DB-Engines, covering relational and non-relational systems. Using source-code heuristics, we analyzed DBMS popularity, stability, migration patterns, co-occurrence, and the role of Object-Relational Mappers (ORMs). Our findings show that MySQL and PostgreSQL are the most popular DBMSs in our corpus. Among non-relational DBMSs, Redis and MongoDB are the most frequently used and tend to remain stable after adoption. In contrast, systems such as HyperSQL are more often replaced as projects evolve. We also observed frequent co-use of multiple DBMSs, suggesting patterns of polyglot persistence in which projects combine systems to handle different data needs. Finally, we found that ORM frameworks are commonly used to mediate interactions between applications and DBMSs. Overall, our study provides empirical evidence on how DBMSs are adopted, combined, and replaced over time, offering guidance for developers, architects, educators, and DBMS vendors.
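The abstract does not spell out the source-code heuristics, so the following is only a minimal sketch of what such a heuristic could look like. It assumes that DBMS usage can be approximated by searching project text for well-known connection-string prefixes (for example `jdbc:mysql:` or `mongodb://`). The pattern table and the names `detect_dbms` and `summarize_history` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical heuristic: connection-string prefixes only.
# The study's actual heuristics are not described in this summary.
DBMS_PATTERNS = {
    "MySQL": re.compile(r"jdbc:mysql:"),
    "PostgreSQL": re.compile(r"jdbc:postgresql:"),
    "HyperSQL": re.compile(r"jdbc:hsqldb:"),
    "MongoDB": re.compile(r"mongodb(\+srv)?://"),
    "Redis": re.compile(r"rediss?://"),
}


def detect_dbms(text):
    """Return the set of DBMS names whose pattern occurs in the given text."""
    return {name for name, pattern in DBMS_PATTERNS.items() if pattern.search(text)}


def summarize_history(snapshots):
    """Summarize DBMS usage over a chronological list of project snapshots.

    Each snapshot is a string holding the project's source and config text
    at one point in its history (an assumption made for this sketch).
    """
    per_snapshot = [detect_dbms(s) for s in snapshots]
    ever_used = set().union(*per_snapshot)
    latest = per_snapshot[-1] if per_snapshot else set()
    return {
        "ever_used": ever_used,
        # Seen at some point but absent from the latest snapshot.
        "dropped": ever_used - latest,
        # True if any single snapshot uses more than one DBMS at once.
        "polyglot": any(len(found) > 1 for found in per_snapshot),
    }


if __name__ == "__main__":
    old = 'String url = "jdbc:hsqldb:mem:testdb";'
    new = 'db = "jdbc:postgresql://localhost/app"; cache = "redis://localhost:6379";'
    print(summarize_history([old, new]))
```

In this toy history, HyperSQL appears only in the earlier snapshot and is reported as dropped, while the later snapshot combines PostgreSQL and Redis and is flagged as polyglot. A real analysis would also need to inspect build dependencies and ORM configuration, which this sketch ignores.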
Problem

Research questions and friction points this paper is trying to address.

Database Management Systems
DBMS adoption
open-source projects
polyglot persistence
DBMS migration
Innovation

Methods, ideas, or system contributions that make the work stand out.

DBMS adoption
polyglot persistence
ORM frameworks
database migration
open-source software evolution
Camila A. Paiva
Instituto de Computação, Universidade Federal Fluminense, Brazil
Raquel Maximino
Instituto de Computação, Universidade Federal Fluminense, Brazil
Frederico Paiva
Instituto de Computação, Universidade Federal Fluminense, Brazil
Rafael Accetta Vieira
Instituto de Computação, Universidade Federal Fluminense, Brazil
Nicole Espanha
Instituto de Computação, Universidade Federal Fluminense, Brazil
João Felipe Pimentel
Instituto de Computação, Universidade Federal Fluminense, Brazil
Igor Wiese
UTFPR - Universidade Tecnológica Federal do Paraná
Mining Software Repositories, Software Evolution, Prediction Models, Software Dependencies
Marco Aurélio Gerosa
Northern Arizona University, USA
Igor Steinmacher
Northern Arizona University
Software Engineering, CSCW, Mining Software Repositories, Open Source Software
Leonardo Murta
Instituto de Computação, Universidade Federal Fluminense, Brazil
Vanessa Braganholo
Professor of Computer Science, Universidade Federal Fluminense
Databases, Provenance