PalimpChat: Declarative and Interactive AI analytics

📅 2025-02-05
🤖 AI Summary
To address the challenge non-expert users face in constructing complex AI analytics pipelines, this paper introduces the first natural language-driven, zero-code AI analytics system. The approach deeply integrates the ReAct-based reasoning agent Archytas with the declarative AI framework Palimpzest, enabling end-to-end pipeline creation, optimization, and interpretable execution via a conversational interface. Key technical contributions include: (1) a hybrid execution engine unifying relational operators with LLM-native operators; (2) an automated tuning mechanism grounded in declarative query optimization; and (3) dynamic execution scheduling coupled with semantic compilation from natural language to executable plans. Evaluated across biomedical, legal, and real-estate domains, the system substantially lowers usability barriers while maintaining high accuracy and real-time interactivity. The implementation, including the core framework and extensible connectors for custom data sources, is open-sourced.

📝 Abstract
Thanks to the advances in generative architectures and large language models, data scientists can now code pipelines of machine-learning operations to process large collections of unstructured data. Recent progress has seen the rise of declarative AI frameworks (e.g., Palimpzest, Lotus, and DocETL) to build optimized and increasingly complex pipelines, but these systems often remain accessible only to expert programmers. In this demonstration, we present PalimpChat, a chat-based interface to Palimpzest that bridges this gap by letting users create and run sophisticated AI pipelines through natural language alone. By integrating Archytas, a ReAct-based reasoning agent, and Palimpzest's suite of relational and LLM-based operators, PalimpChat provides a practical illustration of how a chat interface can make declarative AI frameworks truly accessible to non-experts. Our demo system is publicly available online. At SIGMOD'25, participants can explore three real-world scenarios--scientific discovery, legal discovery, and real estate search--or apply PalimpChat to their own datasets. In this paper, we focus on how PalimpChat, supported by the Palimpzest optimizer, simplifies complex AI workflows such as extracting and analyzing biomedical data.
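The interaction described above, in which an agent turns a chat request into a pipeline of relational and LLM-based operators and then runs it, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `Pipeline`, `fake_llm_mentions`, `plan_actions`, and `chat` are hypothetical names, not the Palimpzest or Archytas APIs, and the "LLM" predicate is a plain substring match standing in for a model call. The planner is scripted rather than model-driven; it only mimics the action/observation loop of a ReAct-style agent.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Hypothetical lazy pipeline: an ordered list of (kind, name, fn) steps."""
    steps: list = field(default_factory=list)

    def filter(self, name, predicate):
        # Return a new pipeline so earlier plans stay unchanged.
        return Pipeline(self.steps + [("filter", name, predicate)])

    def map(self, name, fn):
        return Pipeline(self.steps + [("map", name, fn)])

    def run(self, records):
        out = list(records)
        for kind, _name, fn in self.steps:
            if kind == "filter":
                out = [r for r in out if fn(r)]
            else:
                out = [fn(r) for r in out]
        return out


def fake_llm_mentions(text, keyword):
    """Stand-in for an LLM-based semantic predicate (assumption: substring match)."""
    return keyword.lower() in text.lower()


def plan_actions(request):
    """Scripted stand-in for the agent's reasoning: returns (tool, argument) pairs."""
    actions = []
    if "about" in request:
        topic = request.split("about", 1)[1].strip().rstrip(".")
        actions.append(("add_semantic_filter", topic))
    actions.append(("add_word_count", ""))
    actions.append(("execute", ""))
    return actions


def chat(request, records):
    """Run each planned action and record an observation after every step."""
    pipeline, trace, result = Pipeline(), [], []
    for tool, arg in plan_actions(request):
        if tool == "add_semantic_filter":
            pipeline = pipeline.filter(
                f"mentions:{arg}",
                lambda r, k=arg: fake_llm_mentions(r["text"], k),
            )
            observation = f"pipeline has {len(pipeline.steps)} step(s)"
        elif tool == "add_word_count":
            pipeline = pipeline.map(
                "word_count",
                lambda r: {**r, "words": len(r["text"].split())},
            )
            observation = f"pipeline has {len(pipeline.steps)} step(s)"
        else:
            result = pipeline.run(records)
            observation = f"returned {len(result)} record(s)"
        trace.append((tool, observation))
    return result, trace


PAPERS = [
    {"id": 1, "text": "Colorectal cancer proteomics study"},
    {"id": 2, "text": "Housing prices in Boston"},
]
RESULT, TRACE = chat("Find papers about cancer", PAPERS)
```

In this sketch the plan is built up declaratively and only evaluated when the `execute` action fires, which is the point at which a real system could hand the plan to an optimizer before running it.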
Problem

Research questions and friction points this paper is trying to address.

Simplifying AI pipeline creation
Enhancing accessibility for non-experts
Integrating natural language interfaces
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chat-based interface for AI pipelines
Natural language processing for non-experts
Integration of ReAct-based reasoning agent
Chunwei Liu
Massachusetts Institute of Technology
Databases, Compound AI Systems, LLM, Data Compression, IoT
Gerardo Vitagliano
CSAIL, Massachusetts Institute of Technology
data integration, data preparation, data management for ML
Brandon Rose
Jataware, USA
Matt Prinz
Jataware, USA
David Andrew Samson
Jataware, USA
Michael J. Cafarella
MIT CSAIL, Cambridge, MA, USA