🤖 AI Summary
This study addresses the lack of a comprehensive synthesis of research on breaking changes in software ecosystems, a gap that has left understanding fragmented. Through a systematic literature review of 97 primary studies across five major ecosystems, the work proposes a four-dimensional taxonomy and constructs a multidimensional classification framework. It finds that maintenance and design improvements account for a larger share of breaking changes than new feature work, and that semantic versioning fails as a trust mechanism. Integrating qualitative and quantitative approaches, the review covers syntactic and behavioral change detection, dependency propagation, and ecosystem governance, synthesizing 43 detection approaches and 66 mitigation strategies. Detection reaches high accuracy on syntactic changes, but coverage of behavioral changes remains limited. The study offers actionable practice guidelines and identifies three open challenges and three research opportunities, including the use of large language models for behavioral contract inference.
📝 Abstract
Modern software systems rely on dependency networks of reusable libraries, where breaking changes propagate and cause downstream consumers to fail. Despite growing research across ecosystems, no comprehensive synthesis exists. We conduct a systematic literature review of 97 primary studies, answering four research questions across five ecosystems: Maven/Java, npm/JavaScript, Python, Web APIs, and Linux distributions. The synthesis yields four results. First, we propose a four-dimensional taxonomy of breaking changes along Nature, Detectability, Scope, and Visibility. Second, we identify five reason categories and five impact dimensions, and find that maintenance and design improvements account for a larger share of breaking changes than new feature work. Third, we catalog 43 detection approaches, which reach high accuracy on syntactic breaks but offer limited coverage of behavioral ones. Fourth, we compile 66 strategies for communicating, preventing, and recovering from breaking changes, organized by the actor's role. Based on these findings, we identify three open challenges and three research opportunities. The challenges are behavioral break detection at scale, the failure of semantic versioning as a trust mechanism, and transitive dependency propagation under information asymmetry. The opportunities are LLM-augmented behavioral contract inference, ecosystem-level dependency graph intelligence, and domain-specific tooling for ML and data science.