🤖 AI Summary
This study addresses the limited understanding of how well test suites exercise exceptional behaviors in real-world systems, particularly exceptions that are raised during test execution but never reach the tests. It examines both propagating and non-propagating exceptions by running instrumented test suites of 25 Python projects, covering 5,372 executed methods, 17.9 million method calls, and 1.4 million raised exceptions. The analysis shows that 21.4% of the executed methods raise exceptions at runtime. Among those methods, the median is about one exceptional call per ten invocations; close to 80% raise exceptions infrequently, while about 20% raise them more frequently. These findings challenge the assumption that exception-raising behavior is necessarily abnormal or rare.
📝 Abstract
Exceptions allow developers to handle error cases expected to occur infrequently. Ideally, good test suites should test both normal and exceptional behaviors to catch more bugs and avoid regressions. While current research analyzes exceptions that propagate to tests, it does not explore other exceptions that do not reach the tests. In this paper, we provide an empirical study to explore how frequently exceptional behaviors are tested in real-world systems. We consider both exceptions that propagate to tests and the ones that do not reach the tests. For this purpose, we run an instrumented version of test suites, monitor their execution, and collect information about the exceptions raised at runtime. We analyze the test suites of 25 Python systems, covering 5,372 executed methods, 17.9M calls, and 1.4M raised exceptions. We find that 21.4% of the executed methods do raise exceptions at runtime. In methods that raise exceptions, on the median, 1 in 10 calls exercise exceptional behaviors. Close to 80% of the methods that raise exceptions do so infrequently, but about 20% raise exceptions more frequently. Finally, we provide implications for researchers and practitioners. We suggest developing novel tools to support exercising exceptional behaviors and refactoring expensive try/except blocks. We also call attention to the fact that exception-raising behaviors are not necessarily "abnormal" or rare.
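The abstract does not describe the instrumentation itself. As a minimal sketch of how per-method call and exception counts could be collected, the example below uses Python's standard `sys.settrace` hook; this is an assumption for illustration, not the authors' tooling. The names `run_traced`, `exception_rate`, `parse_int`, and `workload` are hypothetical. The sketch counts an exception in the frame where it originates, so an exception that is caught inside a method and never reaches the test is still recorded.

```python
import sys
from collections import Counter

calls = Counter()   # function name -> number of calls observed
raised = Counter()  # function name -> exceptions originating in that function


def _tracer(frame, event, arg):
    """Trace function (illustrative): records calls and originating exceptions."""
    name = frame.f_code.co_name
    if event == "call":
        calls[name] += 1
    elif event == "exception":
        _exc_type, _exc_value, tb = arg
        # The event also fires in every caller the exception passes through.
        # A traceback with no further entries marks the originating frame.
        if tb is not None and tb.tb_next is None:
            raised[name] += 1
    return _tracer  # keep tracing inside this frame


def run_traced(func, *args, **kwargs):
    """Run func with tracing enabled, restoring any previous trace hook."""
    previous = sys.gettrace()
    sys.settrace(_tracer)
    try:
        return func(*args, **kwargs)
    finally:
        sys.settrace(previous)


def exception_rate(name):
    """Fraction of observed calls to `name` in which an exception originated."""
    return raised[name] / calls[name] if calls[name] else 0.0


# Hypothetical code under test: the ValueError is handled locally,
# so it never propagates to the caller or to a test.
def parse_int(text):
    try:
        return int(text)
    except ValueError:
        return None


def workload():
    for text in ["1", "x", "2", "3", "y", "4", "5", "6", "7", "8"]:
        parse_int(text)
```

Calling `run_traced(workload)` and then `exception_rate("parse_int")` gives the share of `parse_int` calls that exercised exceptional behavior. Keying the counters by function name alone is a simplification; a real study would need to distinguish same-named methods, for example by file and line number.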