🤖 AI Summary
This work addresses two key challenges in machine learning: poor model interpretability and human-dependent model class selection. To this end, it proposes Verbalized Machine Learning (VML), a paradigm that constrains the model parameter space to human-interpretable natural language prompts and uses large language models (LLMs) both as function approximators and as optimizers. Instead of conventional numerical parameters, VML performs learning, including regression and classification, and iterative optimization directly in the natural language space. It supports textual encoding of prior knowledge and gradient-free adaptation of the model structure. Experiments on standard benchmarks show that VML achieves competitive predictive performance while substantially improving model transparency and interpretability. Notably, VML enables semantics-driven automatic model class selection, in which the model class is induced from data and verbalized prior knowledge rather than specified manually.
📝 Abstract
Motivated by the progress made by large language models (LLMs), we introduce the framework of verbalized machine learning (VML). In contrast to conventional machine learning (ML) models that are typically optimized over a continuous parameter space, VML constrains the parameter space to be human-interpretable natural language. Such a constraint leads to a new perspective of function approximation, where an LLM with a text prompt can be viewed as a function parameterized by the text prompt. Guided by this perspective, we revisit classical ML problems, such as regression and classification, and find that these problems can be solved by an LLM-parameterized learner and optimizer. The major advantages of VML include (1) easy encoding of inductive bias: prior knowledge about the problem and hypothesis class can be encoded in natural language and fed into the LLM-parameterized learner; (2) automatic model class selection: the optimizer can automatically select a model class based on data and verbalized prior knowledge, and it can update the model class during training; and (3) interpretable learner updates: the LLM-parameterized optimizer can provide explanations for why an update is performed. We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
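The abstract's core idea, an LLM-parameterized learner whose "parameters" are a text prompt, updated by an LLM-parameterized optimizer, can be sketched as a simple training loop. This is a minimal illustration, not the paper's implementation: `call_llm` (here stubbed with canned responses so the example runs) stands in for a real LLM API, and all prompt wording is hypothetical.

```python
# Minimal sketch of a VML-style training loop (assumptions: the learner's
# parameters are a natural-language model description `theta`, and both the
# learner and the optimizer are calls to an LLM).

def learner(theta: str, x: float, llm) -> str:
    """LLM-parameterized model: the text prompt `theta` IS the parameters."""
    return llm(f"You are a model described by: {theta}\nInput: {x}\nOutput:")

def optimizer(theta: str, batch, preds, llm) -> str:
    """LLM-parameterized optimizer: rewrites `theta` in natural language."""
    feedback = "\n".join(
        f"x={x}, target={y}, prediction={p}" for (x, y), p in zip(batch, preds)
    )
    return llm(
        f"Current model: {theta}\nTraining feedback:\n{feedback}\n"
        "Rewrite the model description to reduce the error:"
    )

def vml_train(theta: str, data, llm, steps: int = 1) -> str:
    """One 'gradient step' = one verbalized rewrite of the model description."""
    for _ in range(steps):
        preds = [learner(theta, x, llm) for x, _ in data]
        theta = optimizer(theta, data, preds, llm)
    return theta

def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM, for illustration only."""
    if "Rewrite the model description" in prompt:
        return "y = 2 * x"  # optimizer proposes an updated verbal model
    return "0.0"            # learner's (initially wrong) prediction

# The learned "parameters" remain human-readable text throughout training.
theta = vml_train("y = x", [(1.0, 2.0)], stub_llm, steps=1)
print(theta)  # → y = 2 * x
```

Because `theta` is plain text at every step, the inductive bias, the model class, and each update's rationale are all directly inspectable, which is the interpretability advantage the abstract emphasizes.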