🤖 AI Summary
This study investigates the root causes of multi-turn interaction necessity in LLM-based question answering, focusing on two dynamic problem attributes: incompleteness and ambiguity. We propose the first neuro-symbolic framework to automatically infer these attributes from interaction logs and establish their causal relationships with required interaction turns and answer correctness. Our key contributions are threefold: (1) We formally define incompleteness and ambiguity as computable, evolution-aware, interaction-driven properties, not static question features; (2) Using a controllably constructed benchmark dataset and human-annotated experiments, we empirically validate that high incompleteness or ambiguity significantly increases turn requirements, while effective interaction systematically reduces both attributes; (3) Our metrics accurately characterize and predict LLM QA performance, providing both theoretical grounding and practical tools for interpretable human-AI collaboration.
📝 Abstract
Natural language has long been anticipated as a medium for human-computer interaction, and the field has undergone a sea change with the advent of Large Language Models (LLMs), which have startling capacities for processing and generating language. Many of us now treat LLMs as modern-day oracles, asking them almost any kind of question. Unlike its Delphic predecessor, consulting an LLM does not have to be a single-turn activity (ask a question, receive an answer, leave); and -- also unlike the Pythia -- it is widely acknowledged that answers from LLMs can be improved with additional context. In this paper, we aim to study when we need multi-turn interactions with LLMs to successfully get a question answered, or to conclude that a question is unanswerable. We present a neuro-symbolic framework that models the interactions between human and LLM agents. Through the proposed framework, we define incompleteness and ambiguity in questions as properties deducible from the messages exchanged in the interaction, and provide results from benchmark problems, in which answer correctness is shown to depend on whether or not questions demonstrate the presence of incompleteness or ambiguity (according to the properties we identify). Our results show that multi-turn interactions are usually required for datasets with a high proportion of incomplete or ambiguous questions, and that increasing interaction length has the effect of reducing incompleteness or ambiguity. The results also suggest that our measures of incompleteness and ambiguity can be useful tools for characterising interactions with an LLM on question-answering problems.
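To make the central idea concrete -- that incompleteness and ambiguity are properties deduced from the exchanged messages, which evolve as the interaction proceeds -- here is a minimal toy sketch. It is not the paper's neuro-symbolic framework: the `Turn` type, the `deduce_attributes` function, and the keyword cues are all illustrative assumptions standing in for the actual deduction procedure.

```python
from dataclasses import dataclass

# Illustrative keyword cues an LLM clarifying turn might carry;
# the real framework deduces these attributes symbolically, not by keywords.
INCOMPLETENESS_CUES = ("provide", "missing", "need more", "could you specify")
AMBIGUITY_CUES = ("which", "do you mean", "or do you")

@dataclass
class Turn:
    role: str   # "user" or "llm"
    text: str

def deduce_attributes(log):
    """Deduce (incomplete, ambiguous) from an interaction log.

    A turn where the LLM asks for information it lacks marks the question
    incomplete; one where it asks the user to choose among readings marks
    it ambiguous.  A subsequent user reply clears the pending attribute,
    so the result depends on how much of the interaction has happened:
    the attributes are interaction-driven, not static question features."""
    incomplete = ambiguous = False
    for t in log:
        if t.role == "llm":
            low = t.text.lower()
            if any(cue in low for cue in INCOMPLETENESS_CUES):
                incomplete = True
            if any(cue in low for cue in AMBIGUITY_CUES):
                ambiguous = True
        else:
            # A user reply addresses the pending clarification request.
            incomplete = ambiguous = False
    return incomplete, ambiguous

log = [
    Turn("user", "Book me a flight."),
    Turn("llm", "Could you provide the departure city and date?"),
    Turn("user", "From Boston, on May 3rd."),
    Turn("llm", "Which departure do you prefer, the 7am or the noon one?"),
]
```

Evaluating `deduce_attributes` on successive prefixes of `log` shows the evolution the abstract describes: the question is incomplete after the LLM's first clarifying turn, neither incomplete nor ambiguous once the user supplies the missing details, and ambiguous again when two readings remain open.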