🤖 AI Summary
Problem: Existing work on chart understanding, spanning parsing, question answering (QA), and generation, treats these tasks separately, so little semantic knowledge is shared across tasks and modalities.
Method: We propose the first schema-centric unified multitask framework enabling bidirectional consistency learning between generation and parsing. Our approach (1) introduces a generation→parsing→regeneration cycle-consistency objective; (2) designs a cross-task-aligned unified schema interface; and (3) releases Chart3D, the first strongly aligned three-task chart dataset. The model employs a Transformer-based cross-modal encoder-decoder, jointly trained on all tasks with explicit schema modeling and contrastive consistency loss.
Results: Our method achieves state-of-the-art performance on chart generation, parsing, and QA, while significantly improving cross-task generalization. Empirical results validate that bidirectional semantic alignment via schema-driven consistency substantially enhances holistic chart understanding.
📝 Abstract
Current chart-specific tasks, such as chart question answering, chart parsing, and chart generation, are typically studied in isolation, preventing models from learning the shared semantics that link chart generation and interpretation. We introduce CycleChart, a consistency-based learning framework for bidirectional chart understanding and generation. CycleChart adopts a schema-centric formulation as a common interface across tasks. We construct a consistent multi-task dataset, where each chart sample includes aligned annotations for schema prediction, data parsing, and question answering. To learn cross-directional chart semantics, CycleChart introduces a generate-parse consistency objective: the model generates a chart schema from a table and a textual query, then learns to recover the schema and data from the generated chart, enforcing semantic alignment across directions. CycleChart achieves strong results on chart generation, chart parsing, and chart question answering, demonstrating improved cross-task generalization and marking a step toward more general chart understanding models.
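The generate-parse consistency idea can be illustrated with a toy round trip. The sketch below is not CycleChart's implementation: `ChartSchema`, `generate_schema`, `render_chart`, `parse_chart`, and the scoring functions are hypothetical stand-ins, the "chart" is a plain dictionary rather than a rendered image, and the rule-based generator replaces a learned model. It only shows the shape of the loop: produce a schema from a table and a query, build a chart from it, recover the schema and data from that chart, and score how well the two directions agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartSchema:
    """Hypothetical minimal schema: a mark type plus x and y field names."""
    mark: str
    x: str
    y: str


def generate_schema(table, query):
    """Stand-in for the generation direction (table + query -> schema).

    Assumption: a simple rule replaces the learned model.
    """
    columns = list(table[0].keys())
    mark = "line" if "trend" in query.lower() else "bar"
    return ChartSchema(mark=mark, x=columns[0], y=columns[1])


def render_chart(schema, table):
    """Stand-in renderer: returns a dict instead of a rasterized image."""
    return {
        "mark": schema.mark,
        "axes": (schema.x, schema.y),
        "points": [(row[schema.x], row[schema.y]) for row in table],
    }


def parse_chart(chart):
    """Stand-in for the parsing direction (chart -> schema and data)."""
    x, y = chart["axes"]
    schema = ChartSchema(mark=chart["mark"], x=x, y=y)
    data = [{x: px, y: py} for px, py in chart["points"]]
    return schema, data


def schema_agreement(a, b):
    """Fraction of schema fields on which the two schemas agree."""
    fields = ("mark", "x", "y")
    return sum(getattr(a, f) == getattr(b, f) for f in fields) / len(fields)


def cycle_consistency(table, query):
    """Run generate -> render -> parse and score both recovered outputs."""
    schema = generate_schema(table, query)
    chart = render_chart(schema, table)
    recovered_schema, recovered_data = parse_chart(chart)
    expected = [{schema.x: r[schema.x], schema.y: r[schema.y]} for r in table]
    return {
        "schema": schema_agreement(schema, recovered_schema),
        "data": float(recovered_data == expected),
    }


if __name__ == "__main__":
    table = [{"year": 2020, "sales": 10}, {"year": 2021, "sales": 15}]
    print(cycle_consistency(table, "Show the sales trend by year"))
```

In a trained system the agreement scores would be replaced by a differentiable loss on the parser's outputs, and the renderer would produce an actual chart image. The toy version only makes the direction of information flow explicit.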