PalimpChat: Declarative and Interactive AI analytics

📅 2025-02-05

🤖 AI Summary
To address the challenge non-expert users face in constructing complex AI analytics pipelines, this paper introduces the first natural-language-driven, zero-code AI analytics system. The approach integrates the ReAct-based reasoning agent Archytas with the declarative AI framework Palimpzest, enabling end-to-end pipeline creation, optimization, and interpretable execution through a conversational interface. Key technical contributions include: (1) a hybrid execution engine unifying relational operators with LLM-native operators; (2) an automated tuning mechanism grounded in declarative query optimization; and (3) dynamic execution scheduling coupled with semantic compilation from natural language to executable plans. Evaluated across biomedical, legal, and real-estate domains, the system substantially lowers usability barriers while maintaining high accuracy and real-time interactivity. The implementation, including the core framework and extensible connectors for custom data sources, is open-sourced.

📝 Abstract
Thanks to the advances in generative architectures and large language models, data scientists can now code pipelines of machine-learning operations to process large collections of unstructured data. Recent progress has seen the rise of declarative AI frameworks (e.g., Palimpzest, Lotus, and DocETL) to build optimized and increasingly complex pipelines, but these systems often remain accessible only to expert programmers. In this demonstration, we present PalimpChat, a chat-based interface to Palimpzest that bridges this gap by letting users create and run sophisticated AI pipelines through natural language alone. By integrating Archytas, a ReAct-based reasoning agent, and Palimpzest's suite of relational and LLM-based operators, PalimpChat provides a practical illustration of how a chat interface can make declarative AI frameworks truly accessible to non-experts. Our demo system is publicly available online. At SIGMOD'25, participants can explore three real-world scenarios--scientific discovery, legal discovery, and real estate search--or apply PalimpChat to their own datasets. In this paper, we focus on how PalimpChat, supported by the Palimpzest optimizer, simplifies complex AI workflows such as extracting and analyzing biomedical data.
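The idea of a chat agent assembling a declarative pipeline from relational and LLM-based operators can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `Plan`, `sem_filter`, `rel_filter`, `build_plan`, and `fake_llm_judge` are names invented for illustration and are not the Palimpzest or Archytas APIs. A keyword matcher stands in for the LLM call, and a fixed list of actions stands in for the tool calls a ReAct-style agent would emit after reasoning over the user's request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def fake_llm_judge(text: str, predicate: str) -> bool:
    """Stand-in for an LLM call (assumption): true if every word of the
    natural-language predicate occurs in the text."""
    lowered = text.lower()
    return all(word in lowered for word in predicate.lower().split())


@dataclass(frozen=True)
class Plan:
    """A lazily built sequence of logical operators; nothing runs until execute()."""
    ops: Tuple = ()

    def sem_filter(self, predicate: str) -> "Plan":
        # LLM-style operator: keep records that satisfy a natural-language predicate.
        return Plan(self.ops + (("sem_filter", predicate),))

    def rel_filter(self, fn: Callable[[Record], bool]) -> "Plan":
        # Relational operator: keep records that satisfy an ordinary predicate.
        return Plan(self.ops + (("rel_filter", fn),))

    def execute(self, records: List[Record]) -> List[Record]:
        out = list(records)
        for kind, arg in self.ops:
            if kind == "sem_filter":
                out = [r for r in out if fake_llm_judge(r["text"], arg)]
            else:  # "rel_filter"
                out = [r for r in out if arg(r)]
        return out


def build_plan(actions: List[Tuple[str, object]]) -> Plan:
    """Translate agent tool calls (hypothetical tool names) into a plan."""
    plan = Plan()
    for tool, arg in actions:
        if tool == "sem_filter":
            plan = plan.sem_filter(arg)
        elif tool == "min_year":
            plan = plan.rel_filter(lambda r, y=arg: int(r["year"]) >= y)
        else:
            raise ValueError(f"unknown tool: {tool}")
    return plan


papers = [
    {"title": "Mutations in colorectal cancer", "text": "A study of colorectal cancer mutations", "year": "2021"},
    {"title": "Boston housing", "text": "Housing prices in Boston", "year": "2022"},
    {"title": "Screening outcomes", "text": "Colorectal cancer screening outcomes", "year": "2015"},
]

# Actions an agent might emit for: "find papers since 2020 about colorectal cancer".
plan = build_plan([("sem_filter", "colorectal cancer"), ("min_year", 2020)])
print([r["title"] for r in plan.execute(papers)])
```

Because the plan is only a description of operators until `execute` is called, a real system has room to reorder or re-implement operators before running them, which is the role the abstract assigns to the Palimpzest optimizer. This sketch performs no such optimization.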
Problem

Research questions and friction points this paper is trying to address.

Simplifying AI pipeline creation
Enhancing accessibility for non-experts
Integrating natural language interfaces
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chat-based interface for AI pipelines
Natural language processing for non-experts
Integration of ReAct-based reasoning agent
Chunwei Liu
Massachusetts Institute of Technology
Databases, Compound AI Systems, LLM, Data Compression, IoT
Gerardo Vitagliano
CSAIL, Massachusetts Institute of Technology
data integration, data preparation, data management for ML
Brandon Rose
Jataware, USA
Matt Prinz
Jataware, USA
David Andrew Samson
Jataware, USA
Michael J. Cafarella
MIT CSAIL, Cambridge, MA, USA