🤖 AI Summary
Problem: Existing BI systems employ LLM agents for isolated tasks (e.g., NL2SQL or NL2VIS), which leaves work fragmented across data roles and tools and makes cross-task collaboration inefficient.
Method: We propose DataLab, a unified LLM-powered platform for end-to-end enterprise BI that integrates multi-role collaboration across data preparation, analysis, and visualization. The approach introduces three key components: (1) a domain-knowledge injection module, (2) a cross-agent communication mechanism, and (3) a cell-level context management strategy. Together they unify natural language interaction with programmable computational notebooks.
Results: The platform achieves state-of-the-art performance on multiple research benchmarks. On Tencent’s real-world BI datasets, it achieves up to a 58.58% increase in accuracy and a 61.65% reduction in token cost on enterprise-specific BI tasks.
📝 Abstract
Business intelligence (BI) transforms large volumes of data within modern organizations into actionable insights for informed decision-making. Recently, large language model (LLM)-based agents have streamlined the BI workflow by automatically performing task planning, reasoning, and actions in executable environments based on natural language (NL) queries. However, existing approaches primarily focus on individual BI tasks such as NL2SQL and NL2VIS. The fragmentation of tasks across different data roles and tools leads to inefficiencies and potential errors due to the iterative and collaborative nature of BI. In this paper, we introduce DataLab, a unified BI platform that integrates a one-stop LLM-based agent framework with an augmented computational notebook interface. DataLab supports various BI tasks for different data roles in data preparation, analysis, and visualization by seamlessly combining LLM assistance with user customization within a single environment. To achieve this unification, we design a domain knowledge incorporation module tailored for enterprise-specific BI tasks, an inter-agent communication mechanism to facilitate information sharing across the BI workflow, and a cell-based context management strategy to enhance context utilization efficiency in BI notebooks. Extensive experiments demonstrate that DataLab achieves state-of-the-art performance on various BI tasks across popular research benchmarks. Moreover, DataLab maintains high effectiveness and efficiency on real-world datasets from Tencent, achieving up to a 58.58% increase in accuracy and a 61.65% reduction in token cost on enterprise-specific BI tasks.
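To make the idea of cell-based context management more concrete, the sketch below shows one plausible way a notebook could pass an LLM only the cells that a target cell depends on, rather than the whole notebook. It is an illustration under stated assumptions, not DataLab's implementation: the `Cell` class, the `relevant_context` function, and the choice to infer dependencies from variable names are all hypothetical and do not come from the abstract.

```python
import ast
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical notebook cell: an identifier plus its Python source."""
    cell_id: str
    source: str


def names_defined_and_used(source: str) -> tuple[set[str], set[str]]:
    """Return (names assigned, names read) in a snippet of Python code."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
    return defined, used


def relevant_context(cells: list[Cell], target_id: str) -> list[str]:
    """IDs of the target cell and the earlier cells it transitively depends on.

    Assumption: a cell depends on the most recent earlier cell that assigns
    a variable it reads. In-place mutations and imports are ignored here.
    """
    index = {c.cell_id: i for i, c in enumerate(cells)}
    info = [names_defined_and_used(c.source) for c in cells]
    target = index[target_id]
    needed = {target}
    wanted = set(info[target][1])
    # Walk backwards, pulling in cells that define a name we still need.
    for i in range(target - 1, -1, -1):
        defined, used = info[i]
        if defined & wanted:
            needed.add(i)
            wanted = (wanted - defined) | used
    return [cells[i].cell_id for i in sorted(needed)]


cells = [
    Cell("c1", "sales = [120, 90, 150]"),
    Cell("c2", "costs = [80, 70, 60]"),
    Cell("c3", "total_sales = sum(sales)"),
    Cell("c4", "report = f'Total sales: {total_sales}'"),
]
print(relevant_context(cells, "c4"))
```

In this toy notebook, the `costs` cell is unrelated to the report and is left out of the context, which is the kind of pruning that could reduce token cost. A real system would also need to handle SQL and chart cells, imports, and in-place data mutations.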