Tasks People Prompt: A Taxonomy of LLM Downstream Tasks in Software Verification and Falsification Approaches

📅 2024-04-14
🏛️ arXiv.org
📈 Citations: 6
Influential: 0
🤖 AI Summary
Current LLM-native software engineering lacks a systematic practical framework—particularly in verification and falsification—necessitating unified task taxonomies and prompt-pattern conceptualizations. Method: We conduct a systematic literature review of over 100 papers, employing bibliometric analysis and conceptual clustering to map, classify, and abstract LLM-based downstream tasks in software engineering (SE). Contribution/Results: We propose the first fine-grained SE-specific taxonomy for LLM downstream tasks, encompassing six core clusters: testing, fuzzing, fault localization, vulnerability detection, static analysis, and program verification. Our taxonomy balances cross-task abstraction with task-specific variation modeling, uncovering generalizable prompt-engineering principles. It provides a foundational framework for targeted LLM adaptation, benchmark construction, and empirically grounded engineering practice in SE.

📝 Abstract
Prompting has become one of the main approaches to leverage emergent capabilities of Large Language Models [Brown et al. NeurIPS 2020, Wei et al. TMLR 2022, Wei et al. NeurIPS 2022]. Recently, researchers and practitioners have been "playing" with prompts (e.g., In-Context Learning) to see how to make the most of pre-trained Language Models. By homogeneously dissecting more than a hundred articles, we investigate how software testing and verification research communities have leveraged LLMs' capabilities. First, we validate that downstream tasks are adequate to convey a nontrivial modular blueprint of prompt-based proposals in scope. Moreover, we name and classify the concrete downstream tasks we recover in both validation research papers and solution proposals. To perform classification, mapping, and analysis, we also develop a novel downstream-task taxonomy. The main taxonomy requirement is to highlight commonalities while exhibiting variation points of task types, enabling the identification of emerging patterns across a varied spectrum of Software Engineering problems that encompasses testing, fuzzing, fault localization, vulnerability detection, static analysis, and program verification approaches. Avenues for future research are also discussed based on conceptual clusters induced by the taxonomy.
Problem

Research questions and friction points this paper is trying to address.

Developing conceptual frameworks for LLM-native software engineering practices
Systematically analyzing generative transformations in software verification
Identifying compositional patterns for reliable LLM-native system design
Innovation

Methods, ideas, or system contributions that make the work stand out.

A taxonomy of generative transformations derived from prompt interactions
Transformation relationship patterns identified, analogous to software design patterns
A structured foundation for modular LLM application design
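To make the notion of a prompt-based downstream task concrete, the sketch below assembles an in-context-learning prompt for one of the taxonomy's clusters, fault localization. The template structure, example set, and function name are illustrative assumptions for this summary, not an artifact described in the paper.

```python
# Hypothetical sketch: a few-shot (in-context learning) prompt for the
# fault-localization downstream task. Worked examples precede the snippet
# under analysis, so the model completes the final "Fault:" slot.

FEW_SHOT_EXAMPLES = [
    {
        "code": "def div(a, b):\n    return a / b",
        "answer": "Line 2: no guard against b == 0 (ZeroDivisionError).",
    },
]

def build_fault_localization_prompt(snippet, examples=FEW_SHOT_EXAMPLES):
    """Assemble the prompt: task instruction, worked examples,
    then the code snippet whose fault the model must localize."""
    parts = ["Task: locate the most likely faulty line in the code below."]
    for ex in examples:
        parts.append(f"Code:\n{ex['code']}\nFault: {ex['answer']}")
    parts.append(f"Code:\n{snippet}\nFault:")
    return "\n\n".join(parts)

prompt = build_fault_localization_prompt("def head(xs):\n    return xs[0]")
print(prompt)
```

Other clusters in the taxonomy (e.g., vulnerability detection or test generation) would vary the instruction and example slots of the same template, which is the kind of commonality-with-variation-points the taxonomy aims to expose.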