LAMBDA: A Large Model Based Data Agent

📅 2024-07-24
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
Problem: Data-driven analysis remains inaccessible to cross-domain users lacking programming expertise.

Method: This paper introduces an open-source, no-code multi-agent analytical system featuring a novel programmer–inspector dual-agent architecture. It enables iterative, natural-language-driven code generation, execution, and debugging, while supporting real-time user intervention and plug-and-play integration of external knowledge sources. The approach synergistically combines large language models’ code-generation capabilities, a multi-agent collaboration framework, and an interactive natural-language interface.

Contribution/Results: Experiments across diverse real-world analytical tasks demonstrate significant improvements in both accuracy and efficiency over baseline tools. The system eliminates reliance on user programming proficiency, enhances robustness through iterative validation, and broadens cross-domain accessibility. It establishes a new paradigm for low-barrier, high-reliability human–AI collaborative data analysis.

📝 Abstract
We introduce LArge Model Based Data Agent (LAMBDA), a novel open-source, code-free multi-agent data analysis system that leverages the power of large models. LAMBDA is designed to address data analysis challenges in complex data-driven applications through innovatively designed data agents that operate iteratively and generatively using natural language. At the core of LAMBDA are two key agent roles: the programmer and the inspector, which are engineered to work together seamlessly. Specifically, the programmer generates code based on the user's instructions and domain-specific knowledge, enhanced by advanced models. Meanwhile, the inspector debugs the code when necessary. To ensure robustness and handle adverse scenarios, LAMBDA features a user interface that allows direct user intervention in the operational loop. Additionally, LAMBDA can flexibly integrate external models and algorithms through our proposed Knowledge Integration Mechanism, catering to the needs of customized data analysis. LAMBDA has demonstrated strong performance on various data analysis tasks. It has the potential to enhance data analysis paradigms by seamlessly integrating human and artificial intelligence, making it more accessible, effective, and efficient for users from diverse backgrounds. The strong performance of LAMBDA in solving data analysis problems is demonstrated using real-world data examples. Videos of several case studies are available at https://xxxlambda.github.io/lambda_webpage.
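The programmer and inspector roles described in the abstract can be pictured as a generate, execute, debug loop. The following is a minimal sketch under stated assumptions, not LAMBDA's actual implementation: `solve`, `run_code`, `toy_programmer`, `toy_inspector`, and `max_rounds` are hypothetical names, and the toy agents are plain functions standing in for large-model calls.

```python
import traceback
from typing import Callable, Optional, Tuple

# Hypothetical agent signatures (assumptions, not LAMBDA's API):
#   programmer(instruction, feedback) -> code string
#   inspector(code, error_text)       -> feedback string
Programmer = Callable[[str, Optional[str]], str]
Inspector = Callable[[str, str], str]


def run_code(code: str) -> Tuple[bool, str]:
    """Execute code in a fresh namespace and return (succeeded, error text)."""
    try:
        exec(code, {})
        return True, ""
    except Exception:
        return False, traceback.format_exc()


def solve(instruction: str, programmer: Programmer, inspector: Inspector,
          max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Alternate between programmer and inspector until the code runs."""
    feedback = None
    for attempt in range(1, max_rounds + 1):
        code = programmer(instruction, feedback)
        ok, error = run_code(code)
        if ok:
            return code, attempt
        # The inspector only steps in when execution fails.
        feedback = inspector(code, error)
    # Rounds exhausted: this is where a human would intervene.
    return None, max_rounds


def toy_programmer(instruction: str, feedback: Optional[str]) -> str:
    """Stand-in for a model call: buggy first draft, fixed after feedback."""
    if feedback is None:
        return "result = sum([1, 2, 3]) / count"  # 'count' is undefined
    return "values = [1, 2, 3]\nresult = sum(values) / len(values)"


def toy_inspector(code: str, error: str) -> str:
    """Stand-in for a model call: summarize the last traceback line."""
    return "Fix this error: " + error.strip().splitlines()[-1]


code, attempts = solve("compute the mean", toy_programmer, toy_inspector)
```

In this sketch a `None` result marks the point where the abstract's user intervention would take over; a real system would also sandbox the execution step rather than call `exec` directly.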
Problem

Research questions and friction points this paper is trying to address.

Develops a code-free multi-agent system for data analysis
Integrates large language models to automate coding and debugging
Enables customizable data analysis with external model integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Code-free multi-agent system using LLMs
Programmer and inspector agent collaboration
Flexible external model integration mechanism
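One way to picture a flexible external model integration mechanism is a small registry of user-supplied snippets that is searched before the programmer agent is prompted. The sketch below is an illustration only; the page does not describe how the Knowledge Integration Mechanism works internally, so `KnowledgeEntry`, `score`, `retrieve`, `build_prompt`, the word-overlap similarity, and the `threshold` value are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KnowledgeEntry:
    """A user-supplied model or algorithm (hypothetical structure)."""
    name: str
    description: str
    code: str


def score(query: str, entry: KnowledgeEntry) -> float:
    """Jaccard overlap between query words and description words."""
    q = set(query.lower().split())
    d = set(entry.description.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def retrieve(query: str, entries: List[KnowledgeEntry],
             threshold: float = 0.2) -> Optional[KnowledgeEntry]:
    """Return the best-matching entry, or None if nothing is close enough."""
    best = max(entries, key=lambda e: score(query, e), default=None)
    if best is None or score(query, best) < threshold:
        return None
    return best


def build_prompt(instruction: str, entries: List[KnowledgeEntry]) -> str:
    """Append the retrieved snippet so the programmer agent can reuse it."""
    entry = retrieve(instruction, entries)
    if entry is None:
        return instruction
    return f"{instruction}\n\n# Available external code: {entry.name}\n{entry.code}"


# Illustrative entries; the code fields are stored as text, not executed here.
KNOWLEDGE = [
    KnowledgeEntry(
        name="winsorize",
        description="clip outliers in a numeric column to given quantiles",
        code="def winsorize(s, lo=0.01, hi=0.99):\n"
             "    return s.clip(s.quantile(lo), s.quantile(hi))",
    ),
    KnowledgeEntry(
        name="zscore",
        description="standardize a numeric column to zero mean and unit variance",
        code="def zscore(s):\n    return (s - s.mean()) / s.std()",
    ),
]

prompt = build_prompt("clip outliers in the income column", KNOWLEDGE)
```

Word overlap is used here only to keep the example dependency-free; an embedding-based similarity would be a natural substitute.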
Maojun Sun
Department of Applied Mathematics, The Hong Kong Polytechnic University
Ruijian Han
Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University
Binyan Jiang
The Hong Kong Polytechnic University
Statistics
Houduo Qi
Professor, DSAI and AMA, The Hong Kong Polytechnic University
Mathematical Optimization, Operations Research
Defeng Sun
Department of Applied Mathematics, The Hong Kong Polytechnic University
Yancheng Yuan
Assistant Professor, The Hong Kong Polytechnic University
Optimization Algorithms, Machine Learning
Jian Huang
Department of Applied Mathematics, The Hong Kong Polytechnic University; Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University