AI Summary
This work proposes a general framework that addresses two complementary limitations: existing meta-learning regression methods struggle to integrate expert priors and textual metadata, while large language models (LLMs) offer rich semantic capabilities but constrained regression performance. The approach combines the text-conditional probability outputs of LLMs with neural diffusion or flow-matching processes through a product-of-experts mechanism, enabling joint sampling from a binned probability density "expert" and a diffusion generative model. In doing so, the method preserves the semantic understanding of LLMs while substantially improving regression accuracy. Empirical evaluations across multiple benchmark tasks show consistent gains over both standalone LLMs and neural-process-based approaches, confirming that text-conditioned knowledge contributes effectively to regression performance.
Abstract
Meta-learning methods for regression, such as Neural (Diffusion) Processes, achieve impressive results, but with these models it can be difficult to incorporate expert prior knowledge and information contained in metadata. Large Language Models (LLMs) are trained on giant corpora that include varied real-world regression datasets alongside their descriptions and metadata, leading to impressive performance on a range of downstream tasks. Recent work has extended this to regression tasks and is able to leverage such prior knowledge and metadata, achieving surprisingly good performance, but this still rarely matches dedicated meta-learning methods. Here we introduce a general method for sampling from a product-of-experts of a diffusion or flow matching model and an 'expert' with binned probability density; we apply this to combine neural diffusion processes with LLM token probabilities for regression (which may incorporate textual knowledge), exceeding the empirical performance of either alone.
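To make the product-of-experts idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual algorithm): in log space the scores of the two experts add, so one can run Langevin dynamics on the sum of a diffusion-style score and the (smoothed) score of a piecewise-constant density built from binned probabilities, such as LLM token probabilities over value bins. The Gaussian stand-in for the trained diffusion score and the finite-difference smoothing are assumptions for illustration only.

```python
import numpy as np

def binned_log_density(x, bin_edges, bin_probs):
    """Piecewise-constant log-density from binned probabilities
    (e.g. LLM token probabilities over value bins)."""
    idx = np.clip(np.searchsorted(bin_edges, x) - 1, 0, len(bin_probs) - 1)
    widths = np.diff(bin_edges)
    return np.log(bin_probs[idx] / widths[idx] + 1e-12)

def binned_score(x, bin_edges, bin_probs, eps=1e-3):
    """Finite-difference score of the binned density (a crude smoothing;
    the density is flat within bins, so this is purely illustrative)."""
    return (binned_log_density(x + eps, bin_edges, bin_probs)
            - binned_log_density(x - eps, bin_edges, bin_probs)) / (2 * eps)

def diffusion_score(x, mu=0.0, sigma=1.0):
    """Stand-in for a trained diffusion model's score function;
    here simply the score of a Gaussian."""
    return (mu - x) / sigma**2

def poe_langevin_sample(bin_edges, bin_probs, n_steps=2000, step=1e-2, seed=0):
    """Langevin dynamics on the product of experts: scores add in log space."""
    rng = np.random.default_rng(seed)
    x = rng.normal()
    for _ in range(n_steps):
        s = diffusion_score(x) + binned_score(x, bin_edges, bin_probs)
        x = x + step * s + np.sqrt(2 * step) * rng.normal()
    return x
```

A real implementation would instead couple the binned expert with the diffusion model's noising schedule so both experts are consistent at every noise level; the sketch only shows why scores of independent experts combine additively.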