Towards Universal Tabular Embeddings: A Benchmark Across Data Tasks

📅 2026-04-23
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Existing table embedding methods lack a unified benchmark, making effective cross-task and cross-domain comparisons challenging. To address this gap, this work proposes TEmBed, the first comprehensive evaluation framework for table embeddings that systematically covers four representation granularities: cell, row, column, and full-table levels. Under a consistent experimental setup, the study conducts a systematic assessment of multiple representative models across diverse tasks and domains. The empirical results reveal significant performance variations among existing approaches at different granularities and tasks, offering practical guidance for model selection in real-world applications and advancing the development of general-purpose table representation learning.

πŸ“ Abstract
Tabular foundation models aim to learn universal representations of tabular data that transfer across tasks and domains, enabling applications such as table retrieval, semantic search and table-based prediction. Despite the growing number of such models, it remains unclear which approach works best in practice, as existing methods are often evaluated under task-specific settings that make direct comparison difficult. To address this, we introduce TEmBed, the Tabular Embedding Test Bed, a comprehensive benchmark for systematically evaluating tabular embeddings across four representation levels: cell, row, column, and table. Evaluating a diverse set of tabular representation learning models, we show that which model to use depends on the task and representation level. Our results offer practical guidance for selecting tabular embeddings in real-world applications and lay the groundwork for developing more general-purpose tabular representation models.
Problem

Research questions and friction points this paper is trying to address.

tabular embeddings
foundation models
benchmark
representation learning
table retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

tabular embeddings
foundation models
benchmark
representation learning
TEmBed