🤖 AI Summary
This study addresses the inconsistent quality of quality engineering (QE) artifacts—such as requirements specifications, test cases, and Behavior-Driven Development (BDD) scenarios—automatically generated by large language models (LLMs). We propose an iterative optimization framework integrating forward generation, backward generation, and rubric-guided scoring to enhance artifact quality along four dimensions: clarity, completeness, consistency, and testability. Our approach enables automated, quantitative, and reproducible quality assessment and improvement. Evaluated across 12 real-world projects, the method significantly improves output stability: it preserves high quality when inputs are already strong and substantially outperforms baselines when inputs are weak. The core contribution is the first integration of backward generation with structured rubric-based guidance, establishing a closed-loop, artifact-centric quality enhancement paradigm for QE.
📝 Abstract
Large Language Models (LLMs) are transforming Quality Engineering (QE) by automating the generation of artefacts such as requirements, test cases, and Behavior-Driven Development (BDD) scenarios. However, ensuring the quality of these outputs remains a challenge. This paper presents a systematic technique to baseline and evaluate QE artefacts using quantifiable metrics. The approach combines LLM-driven generation, reverse generation, and iterative refinement guided by rubrics for clarity, completeness, consistency, and testability. Experimental results across 12 projects show that reverse-generated artefacts can outperform low-quality inputs and maintain high standards when inputs are strong. The framework enables scalable, reliable QE artefact validation, bridging automation with accountability.
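The closed loop described above—generate, score against a rubric, and refine until all dimensions pass—can be sketched as follows. This is a minimal illustration, not the paper's implementation: `score_artifact` and `refine` are hypothetical stand-ins for the LLM-driven scoring and forward/reverse generation steps, and the threshold and heuristic values are assumed for the example.

```python
# Illustrative sketch of rubric-guided iterative refinement of a QE artifact.
# The scorer and refiner below are placeholder heuristics; in the actual
# framework both roles would be played by LLM calls.

RUBRIC_DIMENSIONS = ("clarity", "completeness", "consistency", "testability")


def score_artifact(artifact: str) -> dict:
    """Hypothetical rubric scorer: assigns each dimension a score in [0, 1].

    Here a trivial heuristic (does the BDD scenario have Given/Then steps?)
    stands in for an LLM grading pass.
    """
    has_steps = "Given" in artifact and "Then" in artifact
    base = 0.9 if has_steps else 0.4
    return {dim: base for dim in RUBRIC_DIMENSIONS}


def refine(artifact: str) -> str:
    """Hypothetical refinement step (stand-in for forward + reverse generation)."""
    if "Given" not in artifact:
        artifact = "Given a known precondition\n" + artifact
    if "Then" not in artifact:
        artifact = artifact + "\nThen an observable outcome is verified"
    return artifact


def optimize(artifact: str, threshold: float = 0.8, max_iters: int = 5):
    """Iterate score -> refine until every rubric dimension clears the threshold."""
    scores = score_artifact(artifact)
    for _ in range(max_iters):
        if min(scores.values()) >= threshold:
            break
        artifact = refine(artifact)
        scores = score_artifact(artifact)
    return artifact, scores
```

For example, a scenario missing its preconditions and assertions (`"When the user logs in"`) fails the initial rubric pass, is wrapped with Given/Then steps by the refiner, and then clears the threshold on the next scoring pass.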