🤖 AI Summary
This paper introduces the novel paradigm of “empirical computation,” which defines a class of computational capabilities grounded in empirical probability rather than formal correctness and which accepts natural-language and unstructured inputs. Methodologically, it establishes the first systematic theoretical framework for empirical computation, challenging foundational assumptions of the Turing machine model and classical complexity theory. It leverages zero-shot and few-shot reasoning with large language models, integrated with uncertainty modeling and empirical evaluation, and requires neither formal specifications nor domain-specific programming. Contributions include: (1) formally establishing empirical computation as a legitimate, independent research direction within software engineering; (2) demonstrating that its problem-solving performance does not conform to traditional asymptotic complexity constraints (e.g., $O(n \log n)$); and (3) empirically revealing new patterns in solution capability, timeliness, and input robustness, thereby pioneering an evidence-driven foundation for computational theory.
📝 Abstract
In this vision paper, we explore the challenges and opportunities of a form of computation that employs an empirical (rather than a formal) approach, where the solution of a computational problem is returned as empirically most likely (rather than necessarily correct). We call this approach *empirical computation* and observe that its capabilities and limits *cannot* be understood within the classic, rationalist framework of computation. While we take a very broad view of "computational problem", a classic, well-studied example is *sorting*: Given a set of $n$ numbers, return these numbers sorted in ascending order.

* To run a classical, *formal computation*, we might first think about a *specific algorithm* (e.g., merge sort) before developing a *specific* program that implements it. The program will expect the input to be given in a *specific* format, type, or data structure (e.g., unsigned 32-bit integers). In software engineering, we have many approaches to analyze the correctness of such programs. From complexity theory, we know that there exists no correct program that can solve the average instance of the sorting problem faster than $O(n \log n)$.
* To run an *empirical computation*, we might directly ask a large language model (LLM) to solve *any* computational problem (which can be stated informally in natural language) and provide the input in *any* format (e.g., negative numbers written as Chinese characters). There is no (problem-specific) program that could be analyzed for correctness. Also, the time it takes an LLM to return an answer is entirely *independent* of the computational complexity of the problem being solved.

What are the capabilities and limits of empirical computation in general, for specific problems, or for specific instances? Our purpose is to establish empirical computation as a field in software engineering (SE) that is timely and rich with interesting problems.
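The contrast between the two modes of computation can be sketched in a few lines of Python. The merge sort below is an ordinary formal computation over a fixed input type; the `query_llm` function at the end is a hypothetical stand-in (not part of the paper) for any chat-completion API, included only to illustrate that an empirical computation receives the problem and input as free-form text and returns an answer that is at best empirically likely.

```python
# Formal computation: a *specific* algorithm (merge sort), implemented as a
# *specific* program, expecting a *specific* input type (a list of ints).
def merge_sort(nums: list[int]) -> list[int]:
    if len(nums) <= 1:
        return nums
    mid = len(nums) // 2
    left, right = merge_sort(nums[:mid]), merge_sort(nums[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

# Empirical computation: no problem-specific program exists to analyze.
# The problem statement and the input (here, numbers written as Chinese
# characters) are passed to an LLM as plain text. `query_llm` is a
# hypothetical placeholder for any LLM API; its answer is only
# empirically most likely, not guaranteed correct.
prompt = "Sort these numbers in ascending order: 负三, 十二, 五"
# answer = query_llm(prompt)  # hypothetical call, not executed here

print(merge_sort([12, -3, 5]))
```

The formal program can be verified and has a known complexity bound; the prompt-based variant has neither, which is precisely the gap the paper argues a theory of empirical computation must address.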