Large Language Models as Universal Predictors? An Empirical Study on Small Tabular Datasets

📅 2025-08-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically evaluates large language models (LLMs) as universal function approximators for few-shot prediction on small-scale structured data across classification, regression, and clustering tasks. Method: We benchmark state-of-the-art LLMs—including GPT-4o, GPT-5, and Gemini 2.5 Flash—using in-context learning and structured prompting, and compare their performance against traditional machine learning and table-specific models. Contribution/Results: LLMs achieve strong zero-shot classification performance, serving as competitive baselines without fine-tuning and substantially lowering the modeling barrier for business intelligence and exploratory analysis. In contrast, their regression and clustering performance is markedly inferior, primarily due to the absence of explicit mechanisms for modeling continuous output spaces and intrinsic cluster structure. To our knowledge, this work is the first to empirically delineate the “strong classification, weak regression/clustering” capability boundary of LLMs on structured data, providing foundational evidence and practical guidance for LLM-driven data science.

📝 Abstract
Large Language Models (LLMs), originally developed for natural language processing (NLP), have demonstrated the potential to generalize across modalities and domains. With their in-context learning (ICL) capabilities, LLMs can perform predictive tasks over structured inputs without explicit fine-tuning on downstream tasks. In this work, we investigate the empirical function approximation capability of LLMs on small-scale structured datasets for classification, regression and clustering tasks. We evaluate the performance of state-of-the-art LLMs (GPT-5, GPT-4o, GPT-o3, Gemini-2.5-Flash, DeepSeek-R1) under few-shot prompting and compare them against established machine learning (ML) baselines, including linear models, ensemble methods and tabular foundation models (TFMs). Our results show that LLMs achieve strong performance in classification tasks under limited data availability, establishing practical zero-training baselines. In contrast, the performance in regression with continuous-valued outputs is poor compared to ML models, likely because regression demands outputs in a large (often infinite) space, and clustering results are similarly limited, which we attribute to the absence of genuine ICL in this setting. Nonetheless, this approach enables rapid, low-overhead data exploration and offers a viable alternative to traditional ML pipelines in business intelligence and exploratory analytics contexts. We further analyze the influence of context size and prompt structure on approximation quality, identifying trade-offs that affect predictive performance. Our findings suggest that LLMs can serve as general-purpose predictive engines for structured data, with clear strengths in classification and significant limitations in regression and clustering.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' function approximation on small tabular datasets
Comparing LLM performance against traditional ML baselines
Assessing LLM capabilities in classification, regression, and clustering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs for structured data without fine-tuning
Evaluating LLMs against traditional ML baselines
Analyzing context size and prompt structure impact
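The in-context learning setup described above can be sketched as follows: labeled tabular rows are serialized into a few-shot prompt, and the model is asked to complete the label for a held-out row. This is a minimal illustrative sketch, not the paper's actual prompt template; the serialization format, column names, and label vocabulary are assumptions, and the LLM call itself is omitted.

```python
# Hypothetical few-shot prompting sketch for tabular classification.
# Rows, columns, and labels below are illustrative, not from the paper's benchmarks.

def serialize_row(columns, values):
    """Render one tabular row as a comma-separated 'name: value' line."""
    return ", ".join(f"{c}: {v}" for c, v in zip(columns, values))

def build_fewshot_prompt(columns, examples, labels, query, target="label"):
    """Build a few-shot classification prompt from (row, label) pairs,
    ending with an unlabeled query row for the model to complete."""
    lines = [f"Predict the {target} for the final row.", ""]
    for row, label in zip(examples, labels):
        lines.append(f"{serialize_row(columns, row)} -> {target}: {label}")
    lines.append(f"{serialize_row(columns, query)} -> {target}:")
    return "\n".join(lines)

columns = ["age", "income", "tenure"]
examples = [[34, 52000, 3], [58, 91000, 12]]
labels = ["churn", "stay"]
prompt = build_fewshot_prompt(columns, examples, labels, [45, 61000, 7])
print(prompt)
```

The resulting string would be sent to an LLM API as-is; context-size and prompt-structure trade-offs of the kind the paper analyzes correspond to how many example rows are included and how each row is serialized.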
Nikolaos Pavlidis
Institute for Language and Speech Processing, Athena Research Center, 67100 Xanthi, Greece; Department of Electrical and Computer Engineering, Democritus University of Thrace, Kimmeria, 67100 Xanthi, Greece
Vasilis Perifanis
Institute for Language and Speech Processing, Athena Research Center, 67100 Xanthi, Greece; Department of Electrical and Computer Engineering, Democritus University of Thrace, Kimmeria, 67100 Xanthi, Greece
Symeon Symeonidis
Democritus University of Thrace, ENORA Innovation
Sentiment Mining, Intention Mining, Data Analytics, Business Intelligence, Artificial Intelligence
Pavlos S. Efraimidis
Professor, ECE, Democritus University of Thrace and affiliated member of Athena RC
Algorithms, Federated Machine Learning, Privacy, Social Network Analysis, Algorithmic Game Theory