🤖 AI Summary
Existing models struggle to simultaneously ensure data fidelity and aesthetic quality in creative table visualization, particularly lacking deep reasoning, structured planning, and precise data-to-visual mapping capabilities. To address this, we propose ShowTable, a progressive self-correcting pipeline that synergistically integrates multimodal large language models (MLLMs) and diffusion models: the MLLM performs multi-step reasoning, error diagnosis, and reflective planning, while the diffusion model executes high-fidelity visual generation, forming a closed "reason-diagnose-correct-generate" loop. We formally define the novel task of *creative table visualization*, introduce the *reflective refinement* paradigm, and present TableVisBench, the first dedicated benchmark comprising 800 diverse instances with five-dimensional evaluation criteria, as well as three automated data construction pipelines for training the different modules. Our method achieves significant improvements over all baselines on TableVisBench, demonstrating superior data comprehension, cross-modal reasoning, and iterative error correction.
📝 Abstract
While existing generation and unified models excel at general image generation, they struggle with tasks requiring deep reasoning, planning, and precise data-to-visual mapping abilities beyond general scenarios. To push beyond these limitations, we introduce a new and challenging task: creative table visualization, which requires the model to generate an infographic that faithfully and aesthetically visualizes the data from a given table. To address this challenge, we propose ShowTable, a pipeline that synergizes MLLMs with diffusion models via a progressive self-correcting process. The MLLM acts as the central orchestrator, reasoning about the visual plan and judging visual errors to provide refined instructions, while the diffusion model executes the MLLM's commands to achieve high-fidelity results. To support this task and our pipeline, we introduce three automated data construction pipelines for training the different modules. Furthermore, we introduce TableVisBench, a new benchmark with 800 challenging instances across 5 evaluation dimensions, to assess performance on this task. Experiments demonstrate that our pipeline, instantiated with different models, significantly outperforms baselines, highlighting its effective multi-modal reasoning, generation, and error correction capabilities.
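The progressive self-correcting process described above can be sketched as a simple control loop. This is a minimal illustrative sketch only: all function names, the `Critique` type, and the stubbed MLLM/diffusion calls are hypothetical placeholders, not the paper's actual API.

```python
# Hypothetical sketch of a "reason-diagnose-correct-generate" loop.
# reason_plan/diagnose stand in for MLLM calls; generate stands in for
# the diffusion model. All names and behaviors here are illustrative stubs.

from dataclasses import dataclass


@dataclass
class Critique:
    passed: bool        # did the rendered image pass the MLLM's inspection?
    instruction: str    # refined instruction for the next generation round


def reason_plan(table, prior_instruction=None):
    """MLLM step: turn the table (plus any refinement) into a visual plan."""
    base = f"infographic for {len(table)} rows"
    return base if prior_instruction is None else f"{base}; {prior_instruction}"


def generate(plan):
    """Diffusion step: render the plan into an image (stubbed as a string)."""
    return f"image<{plan}>"


def diagnose(image, table):
    """MLLM step: judge visual errors and emit a refined instruction."""
    # Stub: accept once the refinement has been applied at least once.
    if "fix" in image:
        return Critique(passed=True, instruction="")
    return Critique(passed=False, instruction="fix mislabeled values")


def show_table(table, max_rounds=3):
    """Progressive self-correction: plan, generate, diagnose, refine."""
    instruction = None
    image = None
    for _ in range(max_rounds):
        plan = reason_plan(table, instruction)
        image = generate(plan)
        critique = diagnose(image, table)
        if critique.passed:
            return image
        instruction = critique.instruction  # feed the diagnosis back into planning
    return image  # best effort once the round budget is exhausted
```

The key design point the sketch captures is the closed loop: the diagnosis output is routed back into the planning step rather than discarded, so each generation round is conditioned on the previous round's errors.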