🤖 AI Summary
This work addresses the high failure rate of LLM-driven code editing in IDEs caused by insufficient user prompts. Through large-scale telemetry log analysis and qualitative coding, we identify five key categories of information missing from developer prompts and characterize common prompt failure patterns. Building on these findings, we propose AutoPrompter, a context-aware automatic prompt completion method that augments a user's initial prompt with semantically relevant contextual signals (e.g., editor state, file structure, and recent edits) inferred from the surrounding code. Experimental evaluation shows AutoPrompter improves edit correctness by 27% on our test set and reduces user retries. To our knowledge, this is the first systematic study to combine real-world empirical analysis of prompt inadequacy with automated prompt optimization in the IDE setting, establishing a reproducible methodology and empirical foundation for improving the practical utility of LLM-based coding assistants.
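The paper does not give implementation details for AutoPrompter here, but the general idea it describes, enriching a terse user prompt with contextual signals from the editor, can be sketched minimally as follows. All names (`EditorContext`, `augment_prompt`, the field names) are hypothetical illustrations, not the paper's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class EditorContext:
    """Hypothetical container for the contextual signals mentioned above:
    editor state (selected code), file structure (path), and recent edits."""
    file_path: str
    selected_code: str
    recent_edits: list = field(default_factory=list)

def augment_prompt(user_prompt: str, ctx: EditorContext) -> str:
    """Illustrative sketch: prepend inferred context to the user's prompt
    before sending it to the code-editing LLM."""
    sections = [
        f"File: {ctx.file_path}",
        f"Selected code:\n{ctx.selected_code}",
    ]
    if ctx.recent_edits:
        sections.append("Recent edits:\n" + "\n".join(ctx.recent_edits))
    sections.append(f"Instruction: {user_prompt}")
    return "\n\n".join(sections)

# Example: a vague prompt like "make it handle None" gains the context
# an assistant would otherwise have to guess.
ctx = EditorContext(
    file_path="src/utils.py",
    selected_code="def parse(x):\n    return int(x)",
    recent_edits=["renamed helper to parse"],
)
print(augment_prompt("make it handle None", ctx))
```

This is only a structural sketch of prompt augmentation under those assumptions; the actual system presumably infers which signals are relevant rather than concatenating all of them.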
📝 Abstract
Large Language Models (LLMs) are rapidly transforming software engineering, with coding assistants embedded in IDEs becoming increasingly prevalent. While research has focused on improving these tools and understanding developer perceptions, a critical gap remains in understanding how developers actually use them in their daily workflows and, crucially, where they struggle. This paper addresses part of this gap through a multi-phase investigation of developer interactions with Transform Code, an LLM-powered code editing and transformation feature in an IDE widely used at Google. First, we analyze telemetry logs of feature usage, revealing that frequent re-prompting can indicate developer struggles with Transform Code. Second, we conduct a qualitative analysis of unsatisfactory requests, identifying five key categories of information often missing from developer prompts. Finally, based on these findings, we propose and evaluate AutoPrompter, a tool that automatically improves prompts by inferring missing information from the surrounding code context, yielding a 27% improvement in edit correctness on our test set.