Fixturize: Bridging the Fixture Gap in Test Generation

📅 2026-01-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a common failure of large language models (LLMs) when generating unit tests automatically: they neglect to construct the test fixtures a test needs, which often yields invalid or ineffective test cases. To overcome this limitation, the authors propose Fixturize, a novel framework that introduces fixture awareness through diagnostic identification of fixture-dependent functions and an iterative feedback mechanism that actively synthesizes the missing fixtures in coordination with LLMs and search-based testing tools. The study also contributes FixtureEval, the first benchmark annotated with fixture dependencies, filling a critical gap in automated testing pipelines. Evaluated on Python and Java, Fixturize identifies fixture dependencies with 88.38%–97.00% accuracy, improves the suite pass rate (SuitePS) by 18.03%–42.86%, and increases line and branch coverage by up to 31.54% and 119.66%, respectively.

📝 Abstract
Current Large Language Models (LLMs) have advanced automated unit test generation but face a critical limitation: they often neglect to construct the necessary test fixtures, i.e., the environmental setups required for a test to run. To bridge this gap, this paper proposes Fixturize, a diagnostic framework that proactively identifies fixture-dependent functions and synthesizes the required test fixtures through an iterative, feedback-driven process, thereby improving the quality of test suites produced by existing generation approaches. For rigorous evaluation, the authors introduce FixtureEval, a dedicated benchmark comprising 600 curated functions across two Programming Languages (PLs), i.e., Python and Java, with explicit fixture-dependency labels, enabling both the corresponding classification and generation tasks. Empirical results demonstrate that Fixturize is highly effective: it identifies fixture dependencies with 88.38%-97.00% accuracy across benchmarks and, with the auto-generated fixtures, significantly raises the Suite Pass rate (SuitePS) by 18.03%-42.86% on average across both PLs. With the synthesized fixtures in place, Fixturize further improves line/branch coverage when integrated with existing LLM-based and search-based testing tools by 16.85%/24.08% and 31.54%/119.66% on average, respectively. The findings establish fixture awareness as an essential, previously missing component in modern auto-testing pipelines.
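The fixture gap the abstract describes can be illustrated with a minimal Python sketch (illustrative only; the function and helper names below are hypothetical and not taken from the paper): a function that depends on an external file will fail under a naively generated test, whereas a fixture-aware test first synthesizes the required environment and tears it down afterward.

```python
import os
import tempfile

# Function under test: it has a fixture dependency, because it requires
# an existing file on disk before it can run.
def count_lines(path):
    with open(path) as f:
        return sum(1 for _ in f)

# A naive auto-generated test might call count_lines("data.txt") and
# crash with FileNotFoundError. A fixture-aware test instead builds the
# missing environment first (hypothetical sketch, not Fixturize's code):
def make_fixture(contents):
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write(contents)
    return path

def test_count_lines():
    path = make_fixture("a\nb\nc\n")   # set up the fixture
    try:
        assert count_lines(path) == 3  # the test body itself
    finally:
        os.remove(path)                # tear down the fixture

test_count_lines()
```

In this framing, the paper's classification task corresponds to deciding that `count_lines` needs such a setup at all, and the generation task to producing the setup/teardown code around the test body.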
Problem

Research questions and friction points this paper is trying to address.

test generation
test fixtures
LLM
unit testing
fixture dependency
Innovation

Methods, ideas, or system contributions that make the work stand out.

test fixture generation
LLM-based testing
fixture-aware test synthesis
automated unit testing
FixtureEval benchmark
Authors
Pengyu Xue (Shandong University, China)
Chengyi Wang (Bytedance Inc; Large language model)
Zhen Yang (Shandong University; Software Engineering, Code Understanding and Generation, Programming Language Processing)
Xiapu Luo (The Hong Kong Polytechnic University; Mobile Security, Smart Contracts, Network Security, Blockchain, Software Engineering)
Yuxuan Zhang (Shandong University, China)
Xiran Lyu (Shandong University, China)
Yifei Pei (Shandong University, China)
Zonghan Jia (Shandong University, China)
Yichen Sun (Henan University, China)
Linhao Wu (Shandong University, China)
Kunwu Zheng (Shandong University, China)