Interactive Agents to Overcome Ambiguity in Software Engineering

📅 2025-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of ambiguity handling in LLM-based agents for software engineering: erroneous inference, tool misuse, and resource inefficiency when processing vague user instructions. The authors propose the first evaluable framework that decouples ambiguity mitigation into three sequential, modular stages: ambiguity detection, proactive question generation, and interactive clarification. Empirical analysis reveals that state-of-the-art LLMs struggle to autonomously detect instruction ambiguity; however, structured questioning combined with multi-turn human collaboration substantially improves task success rates. Notably, the interaction process itself is formalized as a robustness-enhancing mechanism, establishing human-AI collaboration as a foundational paradigm for reliable agent behavior. The authors instantiate interactive agents using both proprietary and open-weight LLMs and validate the framework's efficacy on realistic code-generation tasks, demonstrating significant performance gains, particularly for complex, ambiguous engineering problems.

📝 Abstract
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes, safety risks due to tool misuse, and wasted computational resources. In this work, we study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance across three key steps: (a) leveraging interactivity to improve performance in ambiguous scenarios, (b) detecting ambiguity, and (c) asking targeted questions. Our findings reveal that models struggle to distinguish between well-specified and underspecified instructions. However, when models interact for underspecified inputs, they effectively obtain vital information from the user, leading to significant improvements in performance and underscoring the value of effective interaction. Our study highlights critical gaps in how current state-of-the-art models handle ambiguity in complex software engineering tasks and structures the evaluation into distinct steps to enable targeted improvements.
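The three evaluation steps in the abstract suggest a simple agent loop: detect whether an instruction is underspecified, ask a targeted clarifying question, fold the answer back in, and only then generate code. The sketch below is a toy illustration of that loop, not the paper's implementation: `detect_ambiguity` uses a keyword heuristic as a stand-in for an LLM judgment, and `ask_user`, `generate_code`, and the `Turn` record are hypothetical names introduced here for clarity.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    """One round of clarification: the agent's question and the user's answer."""
    question: str
    answer: str

def detect_ambiguity(instruction: str) -> bool:
    # Toy stand-in for an LLM-based ambiguity check: flag instructions
    # containing vague filler words as underspecified.
    vague_markers = ("something", "somehow", "etc")
    return any(marker in instruction.lower() for marker in vague_markers)

def generate_question(instruction: str) -> str:
    # Stand-in for targeted question generation.
    return f"Could you clarify what exactly you need? You said: '{instruction}'"

def interactive_codegen(instruction, ask_user, generate_code, max_turns=3):
    """Clarify an underspecified instruction via multi-turn interaction,
    then hand the refined instruction to a code generator."""
    history = []
    for _ in range(max_turns):
        if not detect_ambiguity(instruction):
            break
        question = generate_question(instruction)
        answer = ask_user(question)
        history.append(Turn(question, answer))
        # Treat the user's clarification as the refined instruction.
        instruction = answer
    return generate_code(instruction, history), history
```

With a scripted user and a dummy generator, a vague request triggers exactly one clarification turn before code is produced, while a well-specified request skips interaction entirely; this mirrors step (b) of the evaluation, where distinguishing the two cases is the hard part.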
Problem

Research questions and friction points this paper is trying to address.

Handling ambiguous user instructions
Improving interactive code generation
Detecting and resolving underspecified inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular three-stage framework: ambiguity detection, targeted question generation, interactive clarification
Interaction itself formalized as a robustness-enhancing mechanism
Multi-turn clarifying questions significantly improve success on underspecified tasks