pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods

📅 2025-02-07

🤖 AI Summary

Problem: Research on Python unit test generation has lacked a large, publicly available dataset that maps tests to the units they exercise, largely because Python's loose naming conventions make it hard to associate a test method with its focal method. Method: The authors present pyMethods2Test, a large-scale open-source Python dataset comprising over 2 million mappings from test methods to focal methods. The mappings are produced by analyzing test and non-test code with a carefully designed set of heuristics, and the dataset can also generate context for use as LLM input. Contribution/Results: pyMethods2Test is derived from over 88K GitHub repositories that use pytest or unittest, in which over 22 million test methods were located. It is hosted on Zenodo as a resource for training and evaluating Python test-generation models.

📝 Abstract

Python is one of the fastest-growing programming languages and currently ranks as the top language in many lists, even recently overtaking JavaScript as the top language on GitHub. Given its importance in data science and machine learning, it is imperative to be able to effectively train LLMs to generate good unit test cases for Python code. This motivates the need for a large dataset to provide training and testing data. To date, while other large datasets exist for languages like Java, none publicly exist for Python. Python poses difficult challenges in generating such a dataset, due to its less rigid naming requirements. In this work, we consider two commonly used Python unit testing frameworks: Pytest and unittest. We analyze a large corpus of over 88K open-source GitHub projects utilizing these testing frameworks. Using a carefully designed set of heuristics, we are able to locate over 22 million test methods. We then analyze the test and non-test code and map individual unit tests to the focal method being tested. This provides an explicit traceability link from the test to the tested method. Our pyMethods2Test dataset contains over 2 million of these focal method mappings, as well as the ability to generate useful context for input to LLMs. The pyMethods2Test dataset is publicly available on Zenodo at: https://doi.org/10.5281/zenodo.14264518

Problem

Research questions and friction points this paper is trying to address.

Python unit test dataset

Mapping tests to methods

Training LLMs for test generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzes 88K GitHub projects

Maps tests to focal methods

Generates context for LLMs

🔎 Similar Papers

TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark